256t.org is a domain dedicated to being a public specification for a specific type of content-addressable storage. In this scheme, the last element of a URL path, up to 94 characters long, defines the content at that URL. At some point in the future, it may evolve to also be a public utility for publishing content using the scheme. However, that is currently beyond the scope of this site.
Why 256t.org? A simple standard for a generic content addressable store seems generally useful to me.
Every 94-character path can be used to retrieve content that matches the length and hash specified. If no content is available, a 404 is returned instead.
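A SHA-512 hash is 512 bits, or 64 bytes, long. 64 bytes can be stored in an 86-character Base64 string, because each Base64 character carries 6 bits and the count is rounded up to a whole character.
(64 * 8) / 6 = 85.333..., rounded up to 86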
Content of 64 bytes or less can be stored directly in equal or lesser space. In such cases, the content itself should be base64 encoded and used with a minimum of padding rather than using its hash.
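The arithmetic above can be checked with a short Python sketch. It assumes base64url with all `=` padding stripped, which is only one reading of "a minimum of padding". The helper name `b64url` is illustrative and not part of the specification.

```python
import base64
import hashlib
import math


def b64url(data: bytes) -> str:
    """Encode bytes as base64url with '=' padding removed (an assumption)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# 512 bits at 6 bits per character, rounded up to whole characters.
hash_chars = math.ceil(64 * 8 / 6)

# A real SHA-512 digest is 64 bytes, so it encodes to the same width.
digest = hashlib.sha512(b"example content").digest()
encoded_digest = b64url(digest)

# Content of 64 bytes or less never needs more room than a hash would.
longest_inline = b64url(b"x" * 64)
short_inline = b64url(b"hi")

print(hash_chars, len(encoded_digest), len(longest_inline), short_inline)
```

Under these assumptions, both the encoded digest and the longest inline content come out at 86 characters, while shorter content takes fewer.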
An 8 character base64 string can store 48 bits.
2^48 = 2^40 * 2^8
= 2^10 * 2^10 * 2^10 * 2^10 * 2^8
= 2^8 * 2^10 * 2^10 * 2^10 * 2^10
= 256 K M G T
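As a rough illustration of the 48-bit length prefix, the sketch below encodes a content length into 8 base64url characters. The text here does not state a byte order, so the 6-byte big-endian layout is an assumption, and `encode_length` and `decode_length` are hypothetical helper names.

```python
import base64

MAX_LENGTH = 2**48 - 1  # largest value that fits in 48 bits


def encode_length(length: int) -> str:
    """Encode a content length as 8 base64url characters.

    Assumption: the length is a 6-byte big-endian unsigned integer.
    """
    if not 0 <= length <= MAX_LENGTH:
        raise ValueError("length does not fit in 48 bits")
    # 6 bytes is a multiple of 3, so the output is 8 characters with no padding.
    return base64.urlsafe_b64encode(length.to_bytes(6, "big")).decode("ascii")


def decode_length(prefix: str) -> int:
    """Inverse of encode_length for an 8 character prefix."""
    if len(prefix) != 8:
        raise ValueError("length prefix must be exactly 8 characters")
    return int.from_bytes(base64.urlsafe_b64decode(prefix), "big")
```

Because 6 bytes map to exactly 8 Base64 characters, the prefix never needs padding, and the largest representable length is one byte short of 256 TiB.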
The following can be treated as true enough despite being false:

- A 94 character path uniquely determines content. (However, this is completely true for content of 64 bytes or less, which is stored in the path itself.)
- The content is immutable. (It could be replaced by different content that still meets the description.)
- Content can be safely cached indefinitely.
Thus all HTTP meta information, such as headers and ETags, will indicate that the content can be cached indefinitely.
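One way a server might express this, sketched in Python without any web framework: the `respond` function, the in-memory `store` dictionary, and the exact header values are all assumptions for illustration. The text above only requires that responses signal indefinite cacheability and that missing content yields a 404.

```python
from typing import Dict, Optional, Tuple

# One year in seconds, a common stand-in for "indefinitely" in Cache-Control.
ONE_YEAR = 365 * 24 * 60 * 60


def respond(store: Dict[str, bytes], cid: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a request whose last path element is cid."""
    body: Optional[bytes] = store.get(cid)
    if body is None:
        # No content is available for this identifier.
        return 404, {}, b""
    headers = {
        "Cache-Control": f"public, max-age={ONE_YEAR}, immutable",
        # Hypothetical choice: the identifier already names the content,
        # so it is reused as a strong ETag.
        "ETag": f'"{cid}"',
        "Content-Length": str(len(body)),
    }
    return 200, headers, body
```

The `immutable` directive and a long `max-age` tell caches not to revalidate, which matches the "true enough" assumption that content never changes.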
The 94-character content tag consists of an 8-character length prefix followed by an 86-character hash.
| | length of the content | hash of the content | the content itself |
|---|---|---|---|
| when | always | length(content) > 64 | length(content) <= 64 |
| start | 1 | 9 | 9 |
| end | 8 | 94 | 94 |
| length | 8 | 86 | 0 to 86 |
| format | Base64 | Base64 | Base64 |
| info | length(content) | sha-512(content) | content |
More specifically, the format is filename- and URL-safe Base64, also known as base64url.
This Base64 string of 94 characters or fewer, which identifies content, will be referred to as a content identifier, or CID.
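Putting the layout together, a minimal Python sketch of CID generation might look like the following. It is not the reference implementation: the function names are hypothetical, and it assumes the length is a 6-byte big-endian integer and that `=` padding is stripped entirely.

```python
import base64
import hashlib


def _b64url(data: bytes) -> str:
    """base64url without '=' padding (assumed reading of the spec)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def compute_cid(content: bytes) -> str:
    """Build a CID: 8 character length prefix, then hash or inline content."""
    # Assumption: length is a 6-byte big-endian integer (48 bits, 8 characters).
    prefix = _b64url(len(content).to_bytes(6, "big"))
    if len(content) <= 64:
        # Short content is stored directly instead of its hash.
        tail = _b64url(content)
    else:
        tail = _b64url(hashlib.sha512(content).digest())
    return prefix + tail


if __name__ == "__main__":
    print(compute_cid(b"hello"))
    print(len(compute_cid(b"x" * 1000)))
```

Under these assumptions, content longer than 64 bytes always yields a 94-character CID, while shorter content yields between 8 and 94 characters.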
Any server could expose a base URL with contents that adhere to this spec. Alas, what servers host what content and how to find them is beyond the scope of this text. This server hosts a small set of CIDs here.
Deploying and publishing content-addressed storage using CIDs is also beyond the scope of this text. For information about those topics, see Publishing and Storage.
Have questions about 256t.org? Check out the Frequently Asked Questions page for answers to common questions about CIDs, implementation, security, and more.
There are a few different types of collisions that are important to distinguish between:

- accidental -- purely by chance
- adversarial -- someone tried to cause it
- existing -- a CID has been produced from different content
- problem -- usage of the CID to get content returned the wrong content
The odds of a problem collision are quite low. There are two ways to minimize them:

- Always verify CID content. There are many implementations to do so. It is easier to just lie than engineer a collision.
- Reduce adversaries. If nobody is putting problem content where you might accept it, you are left with just accidents.
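Verifying content against its CID can be sketched as follows. This is an illustration rather than one of the listed implementations: `verify` is a hypothetical name, and the decoding assumes a 6-byte big-endian length prefix and unpadded base64url.

```python
import base64
import hashlib


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text whose '=' padding was stripped (assumed format)."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def verify(cid: str, content: bytes) -> bool:
    """Return True if content matches the length and hash (or inline bytes) in cid."""
    if not 8 <= len(cid) <= 94:
        return False
    try:
        # Assumption: the first 8 characters hold a big-endian 48-bit length.
        declared_length = int.from_bytes(_b64url_decode(cid[:8]), "big")
        payload = _b64url_decode(cid[8:])
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    if len(content) != declared_length:
        return False
    if declared_length <= 64:
        # Short content is carried in the CID itself.
        return payload == content
    return payload == hashlib.sha512(content).digest()
```

A client that runs a check like this on every response turns a lying server into a detectable failure, leaving only genuine hash collisions as a risk.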
I'm comfortable just ignoring accidental problem collisions.
The 256t.org specification has been implemented in multiple programming languages. Each implementation provides utilities to generate and verify content identifiers (CIDs).
| Language | Badge | Code |
|---|---|---|
| Bash | bash | |
| C | c | |
| C++ | cpp | |
| C# | csharp | |
| Clojure | clojure | |
| ClojureScript | clojurescript | |
| CMake | cmake | |
| Crystal | crystal | |
| D | d | |
| Dart | dart | |
| Deno | deno | |
| ECMAScript | ecmascript | |
| Elixir | elixir | |
| Elm | elm | |
| Emacs Lisp | emacs-lisp | |
| Erlang | erlang | |
| F# | fsharp | |
| Fortran | fortran | |
| Go | go | |
| Groovy | groovy | |
| Haskell | haskell | |
| Java | java | |
| JavaScript | javascript | |
| Julia | julia | |
| Kotlin | kotlin | |
| Lua | lua | |
| Nim | nim | |
| Node.js | node | |
| OCaml | ocaml | |
| Perl | perl | |
| PHP | php | |
| PowerShell | powershell | |
| Prolog | prolog | |
| Python | python | |
| R | r | |
| Racket | racket | |
| Ruby | ruby | |
| Rust | rust | |
| Scala | scala | |
| Swift | swift | |
| Tcl | tcl | |
| TypeScript | typescript | |
| Unison | unison | |
| Zig | zig | |