From Filesystems to the Web to Value-Embedded Collective Memory
Jonathan Saragossi
Collective Memory
jonathan@collectivememory.ai
Abstract
We identify three paradigms of information organization, each defined by its atomic unit, its linking mechanism, and its method of determining relevance. The filesystem (1960s–) organizes files in hierarchical directories, linked by paths, with relevance determined by the user’s own schema. The web (1990s–) organizes pages in a flat network, linked by hyperlinks, with relevance determined by external appraisal (PageRank, algorithmic curation). We propose that a third paradigm — collective memory — is now emerging: information organized as memories, geotemporally grounded media artifacts linked by value bonds (economic stakes) and semantic similarities (vector embeddings), with relevance determined by intrinsic valuation through attention markets and AI-augmented retrieval. Unlike the web, where value is assigned externally by platform algorithms, the memory paradigm embeds value directly into the content layer through a bonding-curve staking mechanism. We argue that this architecture is structurally more resistant to manipulation, eliminates the need for a trusted centralized appraiser, and — through per-query attention augmentation — enables genuine epistemic pluralism over a shared dataset. We ground these claims in Wittgenstein’s account of meaning-as-use and Nietzsche’s perspectivism, and situate them within the longer tradition of decentralized information architecture running from Vannevar Bush’s memex through Ted Nelson’s Xanadu. We conclude by addressing unresolved tensions: plutocratic bias, cold-start problems, temporal decay, and the political economy of attention commodification.

Keywords: information architecture, attention economy, collective memory, decentralized curation, bonding curves, vector search, retrieval-augmented generation, epistemic pluralism, perspectivism, Wittgenstein

1. Introduction
Every era of computing has been defined not by its hardware but by how it answers three foundational questions about information:
- What is the unit? What constitutes a single, addressable piece of information?
- What connects units? How are individual pieces related to one another?
- Who decides what matters? How is relevance determined, and by whom?
2. Three Paradigms of Information Organization
2.1 Before the Paradigms: The Memex and Its Unrealized Promise
The intellectual prehistory of information architecture begins not with filesystems but with Vannevar Bush’s 1945 essay “As We May Think,” in which he proposed the memex: a hypothetical device in which an individual stores all their books, records, and communications, accessible with exceeding speed and flexibility. The memex was to be organized not hierarchically but through associative trails — links between documents that mimicked the associative leaps of human cognition. Bush’s vision anticipated both the web’s linking mechanism and the memory paradigm’s emphasis on personal, grounded information artifacts. What neither Bush nor the web ultimately realized was the coupling of value to content — the idea that an artifact might carry within itself a signal of its importance to a community. This is the specific contribution of the memory paradigm.

Ted Nelson’s Xanadu project (1960–) pushed further, proposing bidirectional links, transclusion, and micropayment systems. The micropayment element — the idea that reading content should entail an economic transaction with its creator — is structurally adjacent to the bonding-curve staking of the memory paradigm, though the mechanisms differ fundamentally. Nelson’s vision remained largely unrealized; we argue that the technical conditions for realizing something like it now exist.

2.2 The Filesystem: Sovereignty and Enclosure
The filesystem, formalized in Multics (1965) and Unix (1971), introduced the foundational computational metaphor: information is a file, files live in directories, and directories nest hierarchically. The path /home/user/photos/berlin/2024/protest.jpg is both an address and a description — it encodes the owner, the subject, the approximate time, and the kind of content. The filesystem is, in this sense, a theory of categories made operational.
Relevance in a filesystem is determined by the user’s own organizational schema. There is no external authority on what matters. Search is local: grep, find, filename matching. The filesystem is epistemically sovereign — each user’s directory structure is a private ontology, a personal theory of what categories exist and how they relate. This sovereignty is both the paradigm’s strength and its fundamental limitation.
The limitation is structural: no file knows about any other file unless someone explicitly creates that knowledge. A photograph of a civil rights demonstration in one archive has no connection to a photograph of the same event in another. There is no shared semantic layer. Knowledge is not merely private — it is, in a precise sense, imprisoned in the ontology of whoever created the directory structure.
The philosophical model implicit in the filesystem recalls the picture theory of meaning in Wittgenstein’s Tractatus Logico-Philosophicus (1921): meaning as a fixed relationship between a linguistic expression and a fact in the world. The directory structure pushes this picture to its solipsistic extreme: it functions as a private language, a naming system whose logic is fully transparent only to its author.
2.3 The Web: Public Linking and the Problem of the External Appraiser
The web, proposed by Berners-Lee (1989) and realized through HTTP and HTML, replaced hierarchical paths with hyperlinks — associative, non-hierarchical references from any page to any other page. The atomic unit became the page, and the linking mechanism became the hyperlink. This was a profound structural shift. For the first time, information units could reference each other across organizational and institutional boundaries.

But the web immediately introduced a new problem: with billions of pages and no hierarchy, who decides what matters? The filesystem’s answer — the user decides — does not scale to a global information commons. Some mechanism for surfacing relevance across the entire graph was needed.

The answer was the external appraiser. Google’s PageRank (Brin and Page, 1998) treated hyperlinks as votes: a page linked to by many pages, especially by pages that are themselves highly linked, is more relevant (a minimal sketch of this recurrence appears after the list below). This was an elegant solution, but it introduced a structural dependency with far-reaching consequences. Relevance was no longer a property of the content itself, nor of the user’s own judgment, but of the graph of references around the content, as computed by a centralized actor with its own interests and incentives.

The consequences of this architecture have been extensively documented and are now well understood. We summarize them here because understanding them precisely is necessary for understanding what the memory paradigm proposes to solve — and what it does not:
- Manipulation: The gap between the appraisal mechanism (links, keywords, engagement metrics) and genuine value creates an attack surface. Search engine optimization, link farms, content mills, and bot-driven engagement inflation all exploit this gap. The cost of manipulation is low relative to the benefit, because the signal (links, clicks) is cheap to fake.
- Centralization of epistemic authority: A small number of platforms — Google, Facebook, Twitter/X, TikTok — become the de facto arbiters of what information reaches whom. Their ranking algorithms are the practical epistemology of the internet for billions of people, yet their internal workings are opaque, proprietary, and subject to change without notice or explanation.
- Homogenization: Algorithmic optimization for engagement tends to converge on a narrow band of content types. Pariser’s (2011) filter bubble analysis documented this tendency; subsequent research has both confirmed and complicated the picture. The important structural point is that homogenization is an architectural tendency, not a correctable bug.
- Decoupling of value from content: A page’s discoverability — its effective value to the information ecosystem — lives not in the page itself but in Google’s index, Facebook’s social graph, or Twitter’s recommendation engine. Remove the platform, and the content’s epistemic standing evaporates. This is a fragility of the first order.
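For readers unfamiliar with the mechanics referenced above, the following is a minimal power-iteration sketch of the PageRank recurrence. The damping factor of 0.85 follows Brin and Page's original paper; the toy link graph and the fixed iteration count are illustrative assumptions, and every link target is assumed to be a key of the graph.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank: a page's score is the probability that a
    random surfer, following outgoing links with probability `damping`,
    lands on it. Links act as votes weighted by the voter's own score."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            targets = outgoing or pages        # dangling pages spread their score evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Toy graph: pages that attract more links accumulate higher scores,
    # illustrating the self-reinforcing dynamic the text describes.
    toy_web = {"a": ["b"], "b": ["c"], "c": ["a", "b"], "d": ["c"]}
    print(pagerank(toy_web))
```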
2.4 The Memory Paradigm: Intrinsic Value and Semantic Bonds
We propose that a third paradigm is emerging, one in which three structural innovations combine to address the core limitations of both previous paradigms while introducing new ones.

First, the unit is a memory: not a file (which is format-defined) or a page (which is link-defined and typically document-structured), but a geotemporally grounded media artifact. A memory has an inherent where and when — it is indexical, pointing to a specific moment and location in the world. It is not merely a document about something; it is a record of a particular instance of the world. It also carries AI-generated semantic metadata (description, tags) and a vector embedding that positions it in a continuous high-dimensional semantic space.

Second, links are value bonds and semantic similarities. Memories are connected not by explicit authorial links or hierarchical containment, but by two mechanisms operating simultaneously. Economic bonds: when a user stakes tokens on a memory, they create a value link — a financial claim that connects their identity and resources to that memory. The staking graph is a value network analogous to the web’s link network, but with real economic commitment behind each edge. Semantic bonds: vector embeddings create implicit connections between memories based on content similarity. These connections are not authored by anyone — they emerge from the geometry of meaning in a high-dimensional space.

Third, relevance is intrinsic. A memory’s value is not determined by an external algorithm but by the aggregate economic commitment of the community, encoded directly in the memory’s own state. The staked amount — the total attention tokens committed to a memory — is a property of the memory itself. Remove every search engine, every recommendation algorithm, every platform: the memory still carries its value. This portability of value is, we argue, the most significant architectural innovation of the memory paradigm.
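To fix ideas, here is a minimal sketch of such an artifact as a data structure. The field names are illustrative assumptions chosen to mirror the properties described above (geotemporal grounding, AI-generated metadata, an embedding, and on-artifact value state); they are not the schema of any particular implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    # Geotemporal grounding: the artifact's inherent "where" and "when".
    media_url: str
    latitude: float
    longitude: float
    captured_at: str                        # ISO-8601 timestamp

    # AI-generated semantic metadata and embedding (the semantic-bond layer).
    description: str = ""
    tags: list[str] = field(default_factory=list)
    embedding: list[float] = field(default_factory=list)

    # Intrinsic value: the economic state lives in the artifact itself,
    # not in any external index (the value-bond layer).
    creator: str = ""
    staked_total: float = 0.0               # total attention tokens committed
    principal_reserve: float = 0.0
    revenue_reserve: float = 0.0
    stakes: dict[str, float] = field(default_factory=dict)   # per-staker amounts
```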
3. The Architecture of Intrinsic Value
3.1 Bonding Curves as Epistemic Incentive Structures
In the memory paradigm, value is not assigned to content by an external authority. It is discovered through a market mechanism: a bonding curve that governs the price of staking attention on a memory. The price of staking follows a sublinear power function of the form

P(v) = P₀ + α · v^β,  with β < 1,

where P₀ is the base price, α is the curve coefficient, v is the total value locked (principal plus revenue reserves), and β is a sublinear exponent (empirically set to approximately 0.6 in current implementations).

The sublinearity (β < 1) is epistemically crucial, and its importance extends beyond pricing mechanics. It means that early stakes are cheap and late stakes are expensive. This creates a discovery incentive: the economic reward for identifying a valuable memory before the community has recognized its value is structurally greater than the reward for confirming an already-recognized judgment. The bonding curve is not merely a pricing mechanism — it is an incentive structure that rewards original judgment over conformity.

Compare this with PageRank’s incentive structure. PageRank rewards linking to pages that are already authoritative — a page linked to by high-PageRank pages passes more value. This creates a conservative, self-reinforcing dynamic: the already-authoritative becomes more authoritative, and the unknown remains unknown. The bonding curve inverts this: the already-staked is expensive to stake further, and the unstaked is cheap. The former incentivizes conformity; the latter incentivizes exploration and contrarianism.

The structural parallel to Zahavi’s (1975) handicap principle in evolutionary biology and Spence’s (1973) signaling theory in economics is not incidental. The epistemic power of the staking signal derives precisely from its costliness. A signal that is cheap to produce is easily faked; a signal that requires committing actual capital provides genuine information about the staker’s beliefs.
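To make the discovery incentive concrete, here is a minimal Python sketch of the pricing rule above. The function name and the toy parameter values (P₀ = 1.0, α = 0.5, β = 0.6) are illustrative assumptions; the paper specifies only the functional form and β ≈ 0.6.

```python
def staking_price(value_locked: float,
                  base_price: float = 1.0,
                  alpha: float = 0.5,
                  beta: float = 0.6) -> float:
    """Marginal price of staking on a memory whose reserve currently holds
    `value_locked` tokens: P(v) = P0 + alpha * v**beta, with beta < 1."""
    return base_price + alpha * value_locked ** beta


if __name__ == "__main__":
    # Early stakes are cheap, late stakes are expensive, and the escalation
    # is sublinear in the value already locked.
    for v in (0, 10, 1_000, 100_000):
        print(f"value locked = {v:>7}: price = {staking_price(v):.2f}")
```

With these toy values the marginal price rises from 1.0 at zero value locked to roughly 33 at 1,000 tokens and about 500 at 100,000 tokens, so confirming an already-recognized judgment is far costlier than early discovery.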
3.2 Value Portability and the Absence of the Appraiser
In the web paradigm, content and valuation are architecturally separated. A web page exists at a URL; its relevance exists in Google’s index. The page does not know its own PageRank. If Google disappears, the content survives, but its discoverability — its epistemic standing within the information ecosystem — evaporates entirely.

In the memory paradigm, the valuation is part of the content’s own state. The staked amount, the principal reserve, the revenue reserve, and the ownership structure are all properties of the memory artifact itself, stored alongside the media URL and AI-generated metadata. No intermediary is needed to determine a memory’s worth — any search system can read the staked amount and use it as a relevance signal. The value is portable.

This architectural difference has profound implications for resilience and censorship resistance. A regime that wishes to suppress a document in the web paradigm can do so by pressuring Google to deindex it — the content survives but becomes effectively invisible. In the memory paradigm, the memory carries its community’s valuation with it, and this valuation is a distributed economic fact that cannot simply be zeroed out by deindexing. The suppression of memory becomes genuinely harder.
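To illustrate portability, here is a minimal ranking sketch that any retrieval system could run locally, reading the staked amount straight off the artifact. The cosine retriever, the logarithmic damping of stake, and the blend weight are illustrative assumptions, not mechanisms specified by the paper.

```python
import math
from typing import Sequence

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query: Sequence[float], memories: list, stake_weight: float = 0.3) -> list:
    """Order memories by a blend of semantic similarity and intrinsic stake.

    `memories` is a list of (embedding, staked_amount, payload) triples.
    The log damping and the 0.3 blend weight are illustrative choices."""
    max_log_stake = max((math.log1p(s) for _, s, _ in memories), default=0.0) or 1.0
    scored = []
    for embedding, staked, payload in memories:
        value_signal = math.log1p(staked) / max_log_stake        # normalized to [0, 1]
        score = (1 - stake_weight) * cosine(query, embedding) + stake_weight * value_signal
        scored.append((score, payload))
    return sorted(scored, key=lambda item: item[0], reverse=True)
```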
3.3 Manipulation Resistance: A Formal Analysis
Web-era manipulation exploits the architectural separation of content from value. SEO manipulates the signals (links, keywords, anchor text) that external appraisers use without changing the content itself. Social media manipulation inflates engagement metrics that feed into recommendation algorithms. In both cases, the cost of the manipulation is low relative to the potential benefit, because the signal is cheap to produce.

In the memory paradigm, manipulation requires genuine economic commitment. We identify four structural properties that make it self-limiting:
- Cost escalation: the sublinear bonding curve means that each additional unit of stake costs more than the last. Inflating a memory’s value from a low to a high level requires capital investment that grows superlinearly with the desired level of inflation.
- Capital lock-up: staked tokens are committed to the memory’s reserve. The attacker’s capital is trapped until they redeem, and redemption at an artificially inflated price returns less than was invested if no other participants validate the stake by staking further.
- Creator fee leakage: a percentage of every stake accrues to the memory’s creator as an irrecoverable fee. Manipulating a memory one did not create imposes a continuous cost proportional to the size of the manipulation.
- Transparent stake distribution: other participants can observe the staking pattern. A memory with a single massive stake but no subsequent community interest is a legible signal of manipulation, not of genuine value. AI-augmented search can weight staker count and stake distribution alongside raw staked amounts, making concentrated manipulation easier to identify (a sketch of such a check follows the table below).

| Signal Type | Mechanism | Cost to Fake | Epistemic Quality |
|---|---|---|---|
| Hyperlink (web) | Authorial reference | Low (SEO, link farms) | Weak — easily gamed |
| Engagement metric (web) | Click / dwell / share | Very low (bots) | Very weak — behavioral, not deliberate |
| Economic stake (memory) | Token lock-up on content | High (capital at risk) | Strong — costly, deliberate, auditable |
| Semantic embedding (memory) | Vector geometry | Not applicable (structural, not authored) | Structural — encodes meaning, not intent |
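The fourth property lends itself to a simple screening heuristic. The sketch below flags memories whose stake is both large and concentrated in very few hands, using a Herfindahl-style concentration index over per-staker amounts; the threshold values and function names are illustrative assumptions, not part of any specified protocol.

```python
def stake_concentration(stakes: list[float]) -> float:
    """Herfindahl-Hirschman-style index over per-staker amounts: 1.0 means a
    single staker holds everything; an even spread over n stakers gives 1/n."""
    total = sum(stakes)
    if total == 0:
        return 0.0
    return sum((s / total) ** 2 for s in stakes)


def looks_manipulated(stakes: list[float],
                      large_total: float = 10_000.0,
                      concentration_threshold: float = 0.8) -> bool:
    """Heuristic: a large total stake concentrated in one or two parties, with
    little follow-on community interest, is a legible warning sign. Both
    thresholds are illustrative, not prescribed by the paper."""
    return (sum(stakes) >= large_total
            and stake_concentration(stakes) >= concentration_threshold)
```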
4. Epistemic Pluralism Through Attention Augmentation
4.1 The Epistemological Commitment of Search
Traditional search engines make an implicit epistemological commitment: that for any query, there exists a single correct ranking of results. Google’s results for “climate change” are largely the same whether you are a glaciologist, a fossil fuel lobbyist, a smallholder farmer in Bangladesh, or a politician seeking reelection. The algorithm presents one ordering as the answer — a claim to objectivity that is, on reflection, a philosophical position of considerable audacity.

This position has come under sustained critique from the philosophy of science (Longino, 1990; Haraway, 1988) and from critical algorithm studies (Noble, 2018; Benjamin, 2019). The core objection is not that objectivity is impossible or undesirable, but that the pretense of a view from nowhere conceals the particular perspective (institutional, commercial, cultural) from which the algorithm was designed and the data from which it learned. A ranking that presents itself as objective is more dangerous than one that presents itself as a perspective, because the former forecloses the question of whose interests it serves.

4.2 Wittgenstein: Meaning as Use and the Language Game of Relevance
Wittgenstein’s later philosophy, particularly the Philosophical Investigations (1953), provides a rigorous framework for understanding why singular rankings are epistemically impoverished. His central claim is that the meaning of an expression is not a fixed relationship between that expression and a feature of the world, but is constituted by the practices — the language games — in which it is embedded. “The meaning of a word is its use in the language” (PI §43). Applied to information systems, the relevance of a memory is not a fixed property of that memory or of its relationship to a query. It is constituted by the use context: who is searching, what they intend to do with the results, and what community of practice they belong to. A photograph of a street market in Lagos has different relevance for a food anthropologist, an urban economist, an architect studying informal settlements, and someone researching a travel itinerary. These are not merely different preferences over the same ranking — they are different language games in which the same content participates differently.

Wittgenstein’s concept of family resemblance (PI §67) is also directly instantiated in vector-based semantic search. When the memory paradigm finds memories similar to a query, the similarity is not defined by a single shared property (as in classical categorization) but by a network of overlapping and criss-crossing features — exactly the “complicated network of similarities overlapping and criss-crossing: sometimes overall similarities, sometimes similarities of detail” that Wittgenstein describes. The high-dimensional embedding space is, in a precise mathematical sense, a space of family resemblances: proximity in the space reflects not identity of a single feature but overlap across many features simultaneously.

4.3 Nietzsche: Perspectivism as Engineering Constraint
Nietzsche’s perspectivism — the view that there are no facts, only interpretations, and that every apprehension of the world is from a particular perspective with particular interests — is frequently dismissed as relativism or nihilism. In the context of information architecture, it is neither. It is a precise and actionable design principle.

“There is only a perspective seeing, only a perspective ‘knowing’; and the more affects we allow to speak about one thing, the more eyes, different eyes, we can use to observe one thing, the more complete will our ‘concept’ of this thing, our ‘objectivity,’ be.” — Nietzsche, On the Genealogy of Morals, III.12

Nietzsche’s claim is not that all perspectives are equally valid, but that richer knowledge comes from holding multiple perspectives simultaneously rather than collapsing them into a single authoritative view. Genuine objectivity, on this account, is not the elimination of perspective but the accumulation of many perspectives. The web paradigm, with its single authoritative ranking, implicitly adopts what Nietzsche would call the dogmatist’s error: the belief that there is a single correct view, and that the task of epistemology is to identify and adopt it. The memory paradigm proposes something structurally different: a shared dataset over which multiple perspectives — each backed by economic commitment — can be held simultaneously and surfaced differently to different participants.
4.4 The Augmentation Mechanism: Perspectivism Implemented as Protocol
The memory paradigm implements Nietzsche’s perspectivism as a concrete technical protocol through what we call attention augmentation. Each search result carries multiple value layers (a sketch of how they might be combined follows the table):

| Value Layer | Definition | Epistemic Role |
|---|---|---|
| Public Value | Total attention tokens (ATTN) staked by all users | The community’s collective appraisal |
| Local Value | Relevance weight for this specific query | The query context |
| Augmented Value | Personal adjustment from the searcher’s own staking history | The individual’s interpreted perspective |
| Emergent Value | Synthesis of all three via attention market resolution | The arbitraged, pluralistic result |
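The paper does not fix the resolution function, so the following sketch shows one plausible shape: the emergent value as a weighted combination of log-damped public stake and query-local relevance, multiplied by a boost derived from the searcher's own staking history. All weights, field names, and the boost form are illustrative assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class ScoredMemory:
    memory_id: str
    public_value: float      # total tokens staked by all users (public value)
    local_value: float       # query-specific relevance weight (local value), in [0, 1]

def emergent_value(result: ScoredMemory,
                   searcher_stakes: dict[str, float],
                   public_weight: float = 0.4,
                   local_weight: float = 0.6,
                   augmentation_boost: float = 0.25) -> float:
    """Combine the value layers into a single per-query score.

    Public value is log-damped so very large stakes do not drown out relevance;
    memories the searcher has personally staked on receive a multiplicative
    boost proportional to that stake (the augmented value). Weights and the
    boost factor are illustrative, not specified by the paper."""
    base = public_weight * math.log1p(result.public_value) + local_weight * result.local_value
    personal_stake = searcher_stakes.get(result.memory_id, 0.0)
    return base * (1.0 + augmentation_boost * math.log1p(personal_stake))
```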
4.5 Diversity as Structural Property, Not Policy Intervention
In the web paradigm, diversity of viewpoint is a moderation challenge — something platforms must engineer against their own algorithmic tendencies toward homogenization, typically through explicit editorial policy or regulatory pressure. Diversity is not what the architecture wants; it is what external pressure occasionally achieves.

In the memory paradigm, diversity is a structural property. The bonding curve rewards contrarian staking: because early stakes are cheap, there is an economic incentive to find and commit to undervalued memories — content that the majority has overlooked. This structurally promotes epistemic diversity: the system pays you to disagree with the consensus, if your disagreement proves prescient. Creator incentives reward diverse content: the creator fee means that observers in underserved regions, citizen journalists, and practitioners in niche domains have a direct financial incentive to capture and share memories. The more unique and underrepresented the content, the higher its potential to attract contrarian staking.

5. The Semantic Layer: AI as Infrastructure, Not Authority
5.1 The Role Distinction
In the web paradigm, AI (in the form of recommendation algorithms, ranking systems, and increasingly large language models) functions as an epistemic authority: it decides what you see, in what order, and with what framing. The algorithm’s judgment substitutes for the user’s. Even when the system is personalized, the personalization is an algorithmic intervention, not an expression of the user’s own commitments.

In the memory paradigm, AI plays a fundamentally different role. It functions as semantic infrastructure: it creates the vector space in which human-driven value curation operates, without itself determining which content is valuable. The pipeline is as follows (a sketch appears after the list):
- A memory is uploaded — a human act of observation and documentation
- A multimodal AI model analyzes the visual content and generates a textual description and semantic tags — AI enrichment
- The text is embedded into a high-dimensional vector space — AI infrastructure
- The vector is stored alongside the memory’s metadata in a distributed index — semantic indexing
- Users stake tokens on memories — human acts of valuation and interpretation
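A compact sketch of this division of labor follows, with the AI steps as infrastructure calls and valuation as a purely human action. `caption_model`, `embed_model`, and `index` are assumed interfaces, not named components of any real system, and no particular captioning or embedding model is implied.

```python
def ingest(media_url: str, latitude: float, longitude: float, captured_at: str,
           caption_model, embed_model, index) -> dict:
    """Steps 1-4: upload, AI enrichment, embedding, and semantic indexing."""
    memory = {                                  # 1. human observation and documentation
        "media_url": media_url,
        "lat": latitude,
        "lon": longitude,
        "captured_at": captured_at,
        "staked": 0.0,                          # valuation state starts empty
    }
    memory["description"], memory["tags"] = caption_model(media_url)   # 2. AI enrichment
    memory["embedding"] = embed_model(memory["description"])           # 3. AI infrastructure
    index.add(memory)                                                   # 4. semantic indexing
    return memory


def stake(memory: dict, amount: float) -> None:
    """Step 5: a human act of valuation; the AI layers never decide this."""
    memory["staked"] += amount
```

The enrichment and embedding calls shape where a memory sits in semantic space, but nothing in this pipeline assigns it value; that happens only through the human stake step.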