# Seekia Documentation
![Seekia Banner](../resources/markdownImages/seekiaLogoWithSubtitle.jpg)
### Welcome to the Seekia documentation!
Thank you for being interested in genetics aware mate discovery technology.
This document is a technical description of how Seekia works.
Seekia will continue to evolve along with this document.
Read the whitepaper `Whitepaper.pdf` to understand more about the philosophy and motivations for Seekia.
## What is Seekia?
Seekia is a mate discovery network where users can find a mate while having a deep awareness of each potential mate's genetics.
Users can share genetic information in their profiles such as monogenic disease risks, allele values, and ancestry.
Seekia can calculate the genetic attributes for prospective offspring between users. Seekia enables users to browse and filter potential mates by their genetic attributes and the predicted genetic attributes of their offspring. Seekia allows for users to predict and control the genetic attributes of their offspring by selecting a mate who is the most likely to produce offspring with their desired attributes.
Users can analyze their genomes using the Seekia app to learn about monogenic disease probabilities, polygenic disease risk scores, and traits. Users can share this information in their profiles. Seekia enables users to choose their mate in such a way to prevent their offspring from having monogenic diseases, reduce the probability of their offspring having polygenic diseases, and increase the probability of their offspring having certain traits.
Users can view information about the health and physical traits of their prospective offspring for each user. Users can choose to mate with users with whom their offspring has a lower probability of having diseases and a higher probability of having certain traits.
Seekia aims to increase the genetic desirability of humanity by making humans healthier, more beautiful, more intelligent, more virtuous, and happier. Seekia aims to facilitate eugenic breeding by helping to create mate pairings which are the most likely to produce healthy, beautiful, intelligent, virtuous, and happy offspring.
The goal of Seekia is to accelerate the world's adoption of genetics aware mate discovery technology, and to help the world mate in a genetics aware manner.
## User Identities
### Identity Type
There are three identity types: Mate, Host, and Moderator.
### Identity Key
Each Mate, Host, and Moderator identity has an associated ed25519 public/private key pair.
### Identity Hash
The public identity key is hashed to create an identity hash.
An identity hash is 16 bytes long: (15 byte hash of the identity key, 1 identity type byte)
The first 15 bytes are encoded in Base32, and the final byte is encoded with a single letter representing the identity type.
Users must always be referred to by their identity hash, because two identity keys can be created that generate the same identity hash.
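The layout above (15 hash bytes encoded in Base32, followed by one identity type letter) can be sketched as follows. This is a minimal illustration, not Seekia's implementation: the source does not name the hash function, so SHA-256 is used as a stand-in, and the function name `EncodeIdentityHash` and the type letters `m`, `h`, and `r` are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"errors"
	"fmt"
)

// identityTypeLetters maps each identity type to a one-letter suffix.
// The specific letters are assumptions made for this sketch.
var identityTypeLetters = map[string]byte{
	"Mate":      'm',
	"Host":      'h',
	"Moderator": 'r',
}

// EncodeIdentityHash hashes a 32-byte ed25519 public key, keeps the first
// 15 bytes of the digest, and returns the Base32 text of those bytes
// followed by the identity type letter.
// SHA-256 is a stand-in; the real hash function may differ.
func EncodeIdentityHash(publicKey []byte, identityType string) (string, error) {
	if len(publicKey) != 32 {
		return "", errors.New("ed25519 public keys are 32 bytes long")
	}
	letter, ok := identityTypeLetters[identityType]
	if !ok {
		return "", errors.New("unknown identity type: " + identityType)
	}
	digest := sha256.Sum256(publicKey)
	// 15 bytes = 120 bits = exactly 24 Base32 characters, so no padding is needed.
	encoding := base32.StdEncoding.WithPadding(base32.NoPadding)
	return encoding.EncodeToString(digest[:15]) + string(letter), nil
}

func main() {
	publicKey := make([]byte, 32) // placeholder key of all zero bytes
	hash, err := EncodeIdentityHash(publicKey, "Mate")
	if err != nil {
		panic(err)
	}
	fmt.Println(hash, len(hash))
}
```

Under these assumptions an encoded identity hash is always 25 characters long: 24 Base32 characters plus the type letter.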
## Network
The Seekia network is comprised of hosts and clients.
Hosts can be run anonymously by anyone, and can leave and rejoin at will.
Hosts are responsible for serving profiles, messages, reviews, reports, and parameters.
Hosts can host as much or as little content as they want.
Users connect to hosts to download profiles and messages to browse.
### Network Type
Seekia currently has 2 network types: Mainnet and Testnet 1.
Each network type is described by a single byte. Mainnet == 1, Testnet1 == 2.
Multiple networks allow for the testing of new features on a test network before deploying them on the main network.
Each network has its own payment proof providers, network entry seeds, and parameters. Profiles, messages, reviews, reports, and parameters all contain a network type byte.
Users can switch their app's network type. Upon switching network types, the Seekia client will interface with the new network and delete downloaded database content from different networks. User data such as messages and chat keys are retained, so users can switch between networks without losing sensitive data.
Users can maintain a presence on multiple networks at the same time. For example, a user could use the same Seekia identity to host from two different machines, each on a separate network.
### Ranges
Hosts describe the data they are hosting by sharing their ranges. Ranges are also used in network requests to request portions of the network.
There are two types of ranges: Identity and Inbox.
A range is comprised of two bounds: Start and End.
Identity range bounds are of the type `[16]byte`, and inbox range bounds are of the type `[10]byte`.
To determine if an identity or inbox is within a range, the bytes of the identity/inbox are compared with the Start and End bounds to see if they fall within both bounds.
The code which implements ranges can be found here: `internal/byteRange/byteRange.go`
A host will host all of the messages which were sent to an inbox within the host's Inbox bound, and a host will host all profiles whose author is within the host's Identity bound. The host may have other restrictions as well, such as only choosing to host Viewable profiles. A host has a different range for each Identity Type.
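The bounds check described above can be sketched with a lexicographic byte comparison. This is an illustration only and is not the code in `internal/byteRange/byteRange.go`; the type name `IdentityRange`, the method name `Contains`, and the choice to treat both bounds as inclusive are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
)

// IdentityRange is a sketch of an identity range with two 16-byte bounds.
type IdentityRange struct {
	Start [16]byte
	End   [16]byte
}

// Contains reports whether identityHash falls within the range.
// Both bounds are treated as inclusive, which is an assumption.
func (r IdentityRange) Contains(identityHash [16]byte) bool {
	return bytes.Compare(identityHash[:], r.Start[:]) >= 0 &&
		bytes.Compare(identityHash[:], r.End[:]) <= 0
}

func main() {
	var hostRange IdentityRange
	hostRange.Start[0] = 0x40
	hostRange.End[0] = 0x80

	var inside, outside [16]byte
	inside[0] = 0x50
	outside[0] = 0x90

	fmt.Println(hostRange.Contains(inside), hostRange.Contains(outside))
}
```

An inbox range would look the same with `[10]byte` bounds.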
### Required Downloads
To interface with the Seekia network, all users must download the network parameters.
All hosts and moderators must download all moderator identity reviews and moderator profiles. Downloading all moderator identity reviews and moderator profiles is necessary to determine which moderators are banned. Knowing which moderators are banned is necessary to determine identity, profile, attribute and message verdicts. Moderator profiles must be downloaded to determine which moderators are disabled.
Hosts must download and host all reviews and reports for the identities/profiles/attributes/messages they are hosting. For example, if a host is hosting identities a-b, they must download all reviews and reports reviewing/reporting those identities, and all reviews/reports reviewing/reporting profiles and attributes authored by those identities.
All host profiles are downloaded by all hosts of Seekia.
If hosts are hosting Host/Moderator identities, they must host all of them and their profiles/reviews/reports, rather than a subrange. This is because all Host/Moderator profiles, reviews, and reports should always be a small enough size that breaking them up between hosts is unnecessary.
### Host Identity Balance
Each host identity must be funded with a minimum amount of cryptocurrency to participate in the network.
The cost, expressed in gold per day, is defined by the network parameters.
Hosts can be banned by moderators, so the funding requirement deters spam and rule-breaking behavior.
### Host Profiles
Each Seekia host has a profile which includes the following:
* Their IP address/Tor hidden service address
* Information about what content they are hosting, such as which types (messages/profiles/...) and the ranges of inboxes/identities they are hosting
* The size of the content they are hosting
Host profiles expire after an inactivity period provided in the parameters, so hosts must update their profiles at least that often. Host profile updates are performed automatically by the application for users in Host mode.
### Host Options
#### Clearnet/Tor Mode
Hosts can disable either Tor or Clearnet, or allow both.
Having clearnet-only hosts is necessary in the event that many hosts' Tor hidden services are DDOS attacked.
Clearnet addresses are more resilient against DDOS attacks because they can utilize anti-DDOS services.
Clearnet-only hosts are also needed to be the entry nodes to the network.
Clearnet hosts are also able to download data from other hosts over clearnet, which is much faster than Tor.
### Download Privacy
The Seekia application attempts to query data from hosts in a privacy-preserving manner.
For example, when downloading sensitive content such as a user's messages, a new Tor circuit is created for each inbox to prevent hosts from linking user inboxes together.
#### Requestor Fingerprint
Each request contains information about a requestor, which could be used to link a requestor to a Seekia user and/or understand their behavior.
A simple example is a host, who requests profiles within a specific range, and shares that range on their public host profile. If that host is the only host using that range, any download requests made for that range will be trivially linkable to that host. This is why requests must be broken up by type, sent to different hosts, and crafted in such a way that hosts learn as little as possible about the requestor.
Imagine the example of a user who wants to check for new profiles for 100 downloaded mate profile authors. If they send those 100 authors in a single request, that host now knows that requestor has 100 mate profiles created by those authors downloaded. After offering a list of profiles, hosts will also be able to tell which profiles the user has already downloaded by seeing which profiles the user skips downloading.
The host can use the location of the profiles and their attributes to try to match the requestor to a specific Seekia user. This is easier to accomplish if Seekia is used by a small number of people in a specific region.
The most private way to make this request would be to request each author's profiles individually, over a new Tor circuit and from a different host each time. This one-by-one strategy improves requestor privacy at the cost of a much longer download time.
#### Mate Downloads Criteria
A Seekia Mate user can select their Download Desires within the Seekia app. A user's download desires are used to create their Criteria.
A user's Download Desires are the desires the user is willing to share with hosts when they request Mate profiles to browse. The more attributes they select, the fewer undesired profiles they will download, saving time and bandwidth.
This feature exists because users may not want to share desires that are embarrassing or too personal. The app initializes with Age, Sex, and Distance selected. These are attributes for which most users would not care if their desires were publicly known. In the worst case, a malicious host could link a requestor's criteria to their Seekia identity and share their criteria somewhere.
If a user does not select any download desires, their client will download all Seekia mate profiles to their machine. All Mate profiles could eventually take up terabytes of data, which would be too large for most users, so the client should warn users who do not select any download desires.
An advantage of using Downloads Criteria is that it allows the application to make requests to hosts regarding all users who fulfill a user's criteria. For example, one kind of request involves checking the funded status of all profiles which fulfill a user's criteria. The user does not care if the host learns that they have those profiles downloaded, because the host would only be able to learn their criteria, not their private desires.
Mate users who do not fulfill a user's criteria but still need to have their profiles downloaded are called Outliers. Examples of Outliers include users that have messaged a user, a user's contacts, and a user's Liked users.
#### Desires Pruning Mode
In normal Mate mode operation, Seekia will prune profiles that do not fulfill a user's downloads criteria.
If the application instead deleted all profiles that do not fulfill a Mate user's private desires, then hosts could trivially learn their private desires by matching commonalities between the profiles that the user requests information about.
If the Mate user's machine has run out of space, and pruning profiles based on criteria is not enough, Seekia will enable **Desires Pruning Mode**.
In this mode, the application will prune profiles which do not fulfill the user's private desires to save storage space.
The user will still make requests based on their downloads criteria, but must download **all** of the profiles that they are offered, rather than being able to skip downloading the profiles they already have downloaded. If the requestor rejected profiles they already have downloaded, then the host would learn all of the user's private desires by matching commonalities between the profiles that the user does not reject.
In Desires Pruning Mode, the application never reaches a steady state: it keeps downloading and discarding the same profiles over and over again. When Desires Pruning Mode is disabled, the application can eventually download most of the profiles it needs, and from then on only download new profiles as they are broadcast to the network.
In Desires Pruning Mode, the requestor only downloads a small random sample of profiles from each host, and then skips to another host. Otherwise, they would get stuck downloading from a small number of hosts.
In Desires Pruning Mode, requests that require a user's stored profiles/identities to be shared must instead request information about each profile/identity one-by-one. For example, when getting the funded and viewable statuses for a user's stored profiles, the status for each profile must be requested on its own. This will be slower, but only profiles which fulfill the user's private desires, and whose author's newest profile fulfills the user's desires, will need their status to be downloaded, which will reduce the number of requests which need to be made.
Desires Pruning Mode will only be activated if a user does not have enough storage space to store all of the profiles which fulfill their downloads criteria. This would be more likely if they are not sharing many desires, or if they live in an area with a large number of Seekia users who fulfill their downloads criteria. Most users should never need to enter Desires Pruning Mode, because their downloads criteria should limit the total profiles to download to a small enough size.
Desires Pruning Mode must also be disabled carefully. If the mode is suddenly disabled, then the requestor will leak their locally stored desire-pruned profiles to the first few hosts they request from. The requestor must first download enough profiles without rejecting any, and then they can start to reject profiles which they have already downloaded.
#### Invalid Data Fingerprinting
Here is another fingerprinting attack:
1. Requestor requests profiles which fulfill a provided Mate criteria.
2. Malicious host responds with some profiles that do not fulfill the provided criteria.
3. Requestor requests to download the profiles the requestor is missing, and declines to download the profiles the requestor already has.
The host now knows some profiles that the requestor has that are not within the requestor's downloads criteria. These profiles could be profiles that the requestor is hosting, or contacts that do not fulfill the requestor's downloads criteria. The host could then use this information to identify the requestor's identity, whom the requestor is contacting, and the requestor's host identity.
To prevent this:
1. Upon receiving the response, the requestor will check if any profiles that the host is offering are already downloaded and do not fulfill the request's criteria.
2. The requestor downloads the criteria non-fulfilling profiles, and declines to download any criteria-fulfilling profiles the requestor already has.
3. After the entire download exchange is completed, the user adds the host to their malicious hosts list.
The requestor should only add the host to their malicious hosts list after the entire request has completed. If they instead rejected the host immediately, the host would learn that the requestor already had some of these profiles stored, because the requestor would otherwise have had no way of knowing that the profiles did not fulfill the request criteria.
This way, the requestor leaks no unintended information about which profiles they have and the requestor still is able to download whatever valid profiles the malicious host was offering.
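The countermeasure above can be sketched as a single decision function. This is a simplified illustration under stated assumptions: the function name `PlanDownloads` and its data shapes are hypothetical, profiles are identified by plain strings, and `stored` maps each locally stored profile to whether it fulfills the criteria that were sent in the request.

```go
package main

import "fmt"

// PlanDownloads decides which offered profiles to download and whether the
// host should be added to the malicious hosts list once the whole exchange
// has completed. All names here are hypothetical.
//
// stored maps the hash of each locally stored profile to true if that
// profile fulfills the criteria sent in the request, and false otherwise.
func PlanDownloads(offered []string, stored map[string]bool) (toDownload []string, flagHost bool) {
	for _, hash := range offered {
		fulfills, have := stored[hash]
		switch {
		case !have:
			// Missing profile: download it as usual.
			toDownload = append(toDownload, hash)
		case !fulfills:
			// Already stored, but it should never have been offered for
			// these criteria. Download it again so the host cannot tell
			// that it is already stored, and remember to flag the host.
			toDownload = append(toDownload, hash)
			flagHost = true
		default:
			// Already stored and criteria-fulfilling: safe to decline.
		}
	}
	return toDownload, flagHost
}

func main() {
	offered := []string{"a", "b", "c"}
	stored := map[string]bool{"b": true, "c": false}

	toDownload, flagHost := PlanDownloads(offered, stored)
	fmt.Println(toDownload, flagHost)
}
```

The caller would act on `flagHost` only after every download in `toDownload` has finished, matching the ordering requirement described above.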
## Network Parameters
The Seekia network relies on a set of files called the Network Parameters.
These files contain various data, some of which is necessary for the functioning of the Seekia network.
Some of the parameters exist to help the network adapt to changing conditions, such as currency exchange rates and moderation parameters.
The files are signed and controlled by the network admin(s).
The **AdminPermissions** parameters file defines which admins have permission to sign each network parameters type. The AdminPermissions file can only be authored by the master admin(s), whose public key(s) are encoded into the code of the Seekia application.
All Seekia clients must download the network parameters to be able to host, moderate, or chat.
See `Specification.md` to see each network parameter type and its purpose.
## Blockchain Data
Seekia hosts can choose to provide blockchain data.
A Seekia host who is providing a blockchain's data is offering to share information about deposits made to addresses.
These deposits are only used to calculate moderator identity scores.
Blockchain data is requested from multiple hosts to defend against malicious hosts who share invalid deposit data.
### Relying on a local blockchain node
Seekia users and hosts can host their own cryptocurrency blockchain to eliminate the need to query other hosts for address deposits.
This will make identity scores appear and update faster, and will make the user's Seekia usage more private.
This is useful for any moderators who are already running a node for any blockchain used by Seekia.
Once more cryptocurrencies are added, most blockchain-storing users will probably only have 1 local blockchain node, relying on network hosts for the other blockchain deposits.
### Banning Malicious Hosts
Moderators who are running their own blockchain node can audit the information hosts are providing by making blockchain requests to hosts and comparing their responses to their own node's ledger. These moderators can ban hosts who provide invalid deposit information. The app should be able to do this automatically.
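A minimal sketch of such an audit is shown below, assuming a deposit can be identified by a transaction ID and an amount. The `Deposit` type and the `HostResponseIsValid` function are hypothetical and do not describe Seekia's actual request format.

```go
package main

import "fmt"

// Deposit is a hypothetical record of one deposit made to an address.
type Deposit struct {
	TxID   string
	Amount uint64
}

// HostResponseIsValid reports whether the deposits reported by a host match
// the deposits known to the moderator's own node for the same address.
// Order is ignored; every deposit must match exactly once.
func HostResponseIsValid(local, reported []Deposit) bool {
	if len(local) != len(reported) {
		return false
	}
	expected := make(map[string]uint64, len(local))
	for _, deposit := range local {
		expected[deposit.TxID] = deposit.Amount
	}
	for _, deposit := range reported {
		amount, exists := expected[deposit.TxID]
		if !exists || amount != deposit.Amount {
			return false
		}
		// Remove the entry so a repeated transaction cannot match twice.
		delete(expected, deposit.TxID)
	}
	return true
}

func main() {
	local := []Deposit{{"tx1", 500}, {"tx2", 250}}
	honest := []Deposit{{"tx2", 250}, {"tx1", 500}}
	forged := []Deposit{{"tx1", 500}, {"tx2", 9999}}

	fmt.Println(HostResponseIsValid(local, honest), HostResponseIsValid(local, forged))
}
```

A real audit would likely need to tolerate hosts that are a few blocks behind the moderator's node, so a missing recent deposit alone may not justify a ban.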
## Funding Content
All Seekia messages, mate profiles, and reports must be funded to be hosted by the network.
Identities must be funded to have their profiles hosted by the network.
Funding is required to prevent network spam and discourage bad behavior.
Without a financial cost to broadcasting content, a single actor could spam the network with billions of fake profiles/messages, rendering the network useless. By requiring funds, broadcasting spam costs an attacker money.
Seekia users can be banned if they engage in malicious behavior, so being a malicious user will cost money.
Users spend enough cryptocurrency to have their profile hosted for as long as they initially desire. For example, if a Mate user wants to try out Seekia for 60 days, they fund their Mate identity for 60 days. They can extend their identity's balance any time they want.
Users must initially fund their mate/host identity for a minimum number of days. This only needs to be done once per identity. This is required to prevent an attacker from funding many identities for a very small amount of time (1 hour), and flooding the network with profiles. Moderator identities are funded via Moderator Scores, which are described later in this document. Anyone can fund another user's identity, which is useful if that user's identity is close to expiring.
Each mate profile must be funded individually for a flat fee. Without this, an attacker could replace their identity's mate profile thousands of times, which would spam the moderators with profiles to review. Host and moderator profiles do not have this issue, because these profiles do not need to be approved by the moderators. Host and Moderator profiles can be banned or approved, but they do not need to be approved before being downloaded or viewed by users.
Reports and messages must each be funded individually. Reports use a flat fee, whereas messages are funded based on their size and network duration. Larger messages are more expensive.
The costs to fund identities/profiles/messages/reports are defined in the network parameters. All of the parameter costs must be updated in a way that allows a time period for all clients to update their parameters. Otherwise, some user clients will overpay/underpay because they have outdated costs.
If the spam on Seekia started to increase, the network admins would increase the costs. A balance must be achieved which reduces the amount of spam and rule-breaking but keeps the cost low for users to participate.
To determine the funded status of an identity/profile/message/report, hosts and users request the information from hosts who provide blockchain information.
## Funding With Cryptocurrency
Seekia uses cryptocurrency to fund identities, mate profiles, reports, and messages on the network.
Non-moderator identities, reports, mate profiles, and messages are all funded using payment proofs.
*TODO: Explain merkle tree payment proofs. An explanation currently exists in Future-Plans.md.*
The price of funding content/identities on the network is represented in gold. To calculate the total amount of gold sent to a particular address, Seekia multiplies the amount of crypto sent in each payment by the gold exchange rate at the transaction time, as described in the parameters, and sums the results.
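The gold calculation can be sketched as follows. The data shapes are assumptions: the parameters are modeled as a list of rate periods sorted by start time, and floating point values are used for readability, whereas a real implementation would more likely use integer units.

```go
package main

import "fmt"

// RatePeriod is a hypothetical parameters entry: from StartTime (Unix
// seconds) onward, one unit of the cryptocurrency is worth GoldPerUnit gold.
type RatePeriod struct {
	StartTime   int64
	GoldPerUnit float64
}

// Payment is one payment sent to an address.
type Payment struct {
	Time   int64   // transaction time in Unix seconds
	Amount float64 // amount of cryptocurrency sent
}

// rateAt returns the rate in effect at time t, assuming periods are sorted
// by StartTime in ascending order. It returns false if t is earlier than
// the first period.
func rateAt(periods []RatePeriod, t int64) (float64, bool) {
	rate, found := 0.0, false
	for _, period := range periods {
		if period.StartTime > t {
			break
		}
		rate, found = period.GoldPerUnit, true
	}
	return rate, found
}

// TotalGold sums amount * rate-at-transaction-time over all payments.
// Payments made before the first rate period are skipped.
func TotalGold(payments []Payment, periods []RatePeriod) float64 {
	total := 0.0
	for _, payment := range payments {
		if rate, ok := rateAt(periods, payment.Time); ok {
			total += payment.Amount * rate
		}
	}
	return total
}

func main() {
	periods := []RatePeriod{{StartTime: 0, GoldPerUnit: 2.0}, {StartTime: 100, GoldPerUnit: 4.0}}
	payments := []Payment{{Time: 50, Amount: 1.5}, {Time: 150, Amount: 0.5}}

	// 1.5 * 2.0 + 0.5 * 4.0
	fmt.Println(TotalGold(payments, periods))
}
```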
Moderator identity scores are not funded with payment proofs. Moderators use crypto addresses derived from their identity hash. This makes it easier to calculate a moderator's identity score, as downloading payment proofs is not necessary. It also reduces the amount of data that the Seekia network needs to maintain in perpetuity. Moderators should use blockchain privacy tools to fund their identity scores to avoid linking their crypto wallets with their moderator identity.
All Seekia clients get their hosted message/profile/report funded statuses from blockchain hosts. All communication between clients and payment proof servers must be encrypted with Nacl and Kyber.
Using cryptocurrency for funding content also allows for the timestamping of profiles/reports/messages. The blockchain becomes a source of truth for the earliest time at which a message can be proven to exist. If the sender-alleged message creation time conflicts with the payment proof funding time by more than an hour, a warning could be shown.
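The one-hour check mentioned above could look like the sketch below. The function name `CreationTimeConflicts` is hypothetical, times are Unix seconds, and treating a discrepancy in either direction as a conflict is an assumption.

```go
package main

import "fmt"

// maxTimeDiscrepancySeconds is the one hour tolerance described in the text.
const maxTimeDiscrepancySeconds int64 = 3600

// CreationTimeConflicts reports whether the sender-alleged creation time
// differs from the payment proof funding time by more than one hour,
// in either direction.
func CreationTimeConflicts(allegedCreationTime, fundedTime int64) bool {
	difference := allegedCreationTime - fundedTime
	if difference < 0 {
		difference = -difference
	}
	return difference > maxTimeDiscrepancySeconds
}

func main() {
	fundedTime := int64(1_700_000_000)
	fmt.Println(CreationTimeConflicts(fundedTime-600, fundedTime))  // 10 minutes apart
	fmt.Println(CreationTimeConflicts(fundedTime-7200, fundedTime)) // 2 hours apart
}
```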
After a mate-profile/message/report is funded, its funded status is static, and its expiration time cannot be increased.
### Privacy Risks
Payment proof providers pose a necessary privacy risk. The servers must be trusted to not keep track or log which account funded each profile/identity/message/report. If the proof provider servers were compromised over a period of time, they could be used to log the profiles/identities/messages/reports funded by each account. This would negate the privacy advantages of secret inboxes, making it easier to tell which users are talking to each other.
If an attacker only obtained a snapshot of the servers, they would only learn the balances of each account.
In any server-compromise scenario, the message contents would still be encrypted.
#### Account Crypto Address Linking
Another privacy consideration is the ability to link a user's identity hash to their account crypto address(es).
If a user funds their moderator score and payment proof provider account with the same Ethereum/Cardano wallet, then linking these addresses together is trivial.
Another easy way to link identities to cryptocurrency addresses is to correlate the funding of payment proof provider addresses on the blockchain with the funding of new user identities/profiles on the Seekia network. This issue is mitigated by telling users in the GUI to wait a while after purchasing custodied cryptocurrency before broadcasting their profile for the first time. This breaks the link between their custodied cryptocurrency purchase and their identity/profile being funded.
Even if users are careful to prevent any links between their payment proof provider cryptocurrency transactions and their Seekia identity, observers may still be able to guess that the funds belong to some user of Seekia, because the addresses owned by payment proof providers will be easy to discover. Using blockchain analytics and user profile metadata, they could learn the wallet owner's real world identity.
An attacker could potentially determine which exact messages were sent by a user. If the attacker observed the amount of cryptocurrency that a particular cryptocurrency address sent to a payment proof provider, they could subtract each user's known identity/profile funding transaction amounts and determine which sent message/report costs add up exactly to the amount of funds spent. They could use information about the recipients of the messages to better guess that they had been sent by the suspected user. This would allow attackers to link Seekia users to their cryptocurrency addresses and messages.
If user identities are linked to account crypto addresses, users who send from crypto wallets with large amounts of money could have their crypto wallet balances revealed to the world. This could cause them to become victims of crime or be pursued by gold diggers. Users with large amounts of crypto should use privacy preserving technologies such as zero knowledge accumulators when increasing moderator identity scores or purchasing Seekia payment proof custodied cryptocurrency. This warning is shown within the GUI.
As more people use Seekia and the quantity of Seekia transactions increases, the anonymity set for each payment proof provider increases, and these privacy risks are reduced.
### Multiple Cryptocurrencies
Seekia is designed to support multiple cryptocurrencies.
Ethereum and Cardano are chosen as the first cryptocurrencies to power the Seekia network.
Implementing a currency will undoubtedly encourage some users to buy into and adopt that currency.
We should only support cryptocurrencies that are well established, have a fair coin distribution, and have robust and actively maintained node implementation(s). We should avoid supporting currencies that are scammy and unprincipled.
We should try to support as few currencies as is necessary, because each supported currency is another moving part that the Seekia network must rely upon.
I want to avoid supporting proof of work currencies because they use more energy, rely on more hardware, have less consistent block times, and are arguably less resistant to attacks.
We should try to add currencies that have privacy tools built for them.
Future currencies/networks to support:
1. Ethereum Layer 2s/Sidechains
2. Cardano Layer 2s/Sidechains
### Why not Monero?
Monero is a very useful cryptocurrency, but it is not suitable for Seekia's use case of publicly destroying coins.
Outputs would have to be publicly burned, which would create many useless decoys for other transactions, reducing privacy for other Monero users.
Using Monero in this way would also reduce the privacy of Seekia users. Each burned output's input decoys could more easily be traced to a user's real world identity, aided by the user's profile metadata.
Linking two consecutively burned outputs together would also be quite easy due to the limited number of decoys. An example would be if a user funds their payment proof provider account after funding their moderator identity. This is obviously an even greater problem on transparent blockchains like Ethereum, but Monero has an expectation of being private which we do not want to degrade.
The blockchain servers would also have to parse all the outputs with a public view key, which would be slower.
### Coin Burn Address Type
Cryptocurrencies could implement a special address type for burning coins, so that Seekia transactions do not bloat the UTXO set/state which nodes have to keep track of, and so that wallets can warn users when burning funds.
## Profiles
There are three kinds of profiles: Mate, Host, and Moderator profiles.
### Disabled Profiles
To disable a profile, a user broadcasts a new profile with an attribute "Disabled" set to "Yes".
Profiles of all identity types can be disabled.
Disabled Mate/Host profiles will be retained by the network until the identity type's network profile inactivity duration has passed.
Disabled Moderator profiles must be kept in the network forever, because moderator profiles never expire. This only applies to Moderators who have funded their identity.
## Genetics
Seekia offers the ability to analyze a user's genome, and share information about a user's genetics on their profile.
This allows for users to reduce the probability of their offspring having genetic diseases, and to increase the probability of their offspring having certain traits.
*TODO: Describe these features in more detail.*
## Ancestry
Seekia is an ancestry aware mate discovery network. Users can share the ancestral populations they are descended from and the haplogroups they belong to. Users can filter other users by their ancestry, and can view the calculated ancestry of their prospective offspring with each user. Filtering and sorting by ancestry will help users to find mates who belong to their desired race(s), because ancestry is correlated to race.
User profiles can include ancestral analyses from multiple providers and computational methods. The Seekia app is also planned to provide the ability to perform ancestral analyses from raw genome data files.
### Ancestral Similarity
Ancestral Similarity is a percentage value representing how closely related the ancestries of 2 users are.
It relies on the ancestral composition provided by companies such as 23andMe and AncestryDNA.
A different ancestral analysis method could be created that has many more human populations, none of which has a name. Each population would instead be referred to by a population identifier, which could be a 4 byte value.
## Race
Seekia allows users to filter mates by racial traits such as eye color, skin color, hair color, hair texture, and the alleles in their genome which affect these traits.
## Racial Similarity
Users are able to sort other users based on their racial similarity.
Seekia aims to help people find the most racially similar person, but one who is not similar enough that their offspring would have health issues. This is a tool that is useful in Seekia's goal to cure racial loneliness. Racial loneliness is the condition of being unable to find members of one's own race to mate with and befriend.
Racial similarity aims to help match people who look alike and have similar allele values in their genome which affect physical traits. These matches are more likely to breed children who look similar to them.
Racial similarity is calculated by comparing trait similarity, trait genetic similarity, ancestral similarity, and haplogroup similarity.
### Trait Similarity
To calculate trait similarity, each user's eye color, skin color, hair color, and hair texture are compared.
For example, if both users have blue eyes, their Eye Color similarity is 100%.
Facial similarity detection technology is another planned feature for Seekia. The Seekia app could compare user profile photos to help users to find potential mates who have similar facial structures, helping to cure racial loneliness. Users could also import photos of people they are strongly attracted to for the purpose of finding a mate who looks similar to them.
### Trait Genetic Similarity
Each user can choose to share the genes which affect eye color, skin color, hair color, hair texture, and facial structure.
Seekia compares the percentage of these genes which are similar between two people to calculate genetic similarity for each trait.
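One way to read this calculation is as a percentage of matching values over the locations that both users have shared. The sketch below makes several assumptions: each user's shared data for a trait is a map from a locus name to a genotype string, genotypes are already normalized so that equal genotypes compare equal, and "similar" means an exact match. The locus names and the function name `GeneticSimilarity` are hypothetical.

```go
package main

import "fmt"

// GeneticSimilarity returns the percentage of loci, among those present in
// both maps, at which the two users have the same genotype. The second
// return value is false when the users share no loci, in which case no
// similarity can be calculated.
func GeneticSimilarity(userA, userB map[string]string) (percent float64, ok bool) {
	shared, matching := 0, 0
	for locus, genotypeA := range userA {
		genotypeB, exists := userB[locus]
		if !exists {
			continue // only compare loci that both users have shared
		}
		shared++
		if genotypeA == genotypeB {
			matching++
		}
	}
	if shared == 0 {
		return 0, false
	}
	return 100 * float64(matching) / float64(shared), true
}

func main() {
	// Hypothetical eye color loci for two users.
	userA := map[string]string{"locus1": "AG", "locus2": "CC", "locus3": "TT", "locus4": "GG", "locus9": "AA"}
	userB := map[string]string{"locus1": "AG", "locus2": "CT", "locus3": "TT", "locus4": "GG"}

	percent, ok := GeneticSimilarity(userA, userB)
	fmt.Println(percent, ok)
}
```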
## Chat
Seekia allows users to chat.
Messages can contain an image, text, an emoji, or a questionnaire response.
Messages are encrypted with Nacl and Kyber using the keys broadcast in the recipient's profile.
Each message has an Inbox, which is used by the recipient when downloading their messages from hosts.
Only Mate/Moderator users can send messages. Hosts are excluded, because I see no need at this moment for hosts to communicate over the network. Hosts can always share some other communication method such as an email address on their profile.
Inter-IdentityType communication is forbidden. Mate users can only contact Mate users, and moderators can only contact other moderators. This rule exists to prevent Mate users from being manipulated by malicious Moderators.
### Funding Messages
Each message must be funded with cryptocurrency. The cost depends on the size and duration of the message.
Each message only has 2 options for duration: 2 days and 2 weeks. This makes all messages look more similar, reducing the possibility of linking a message fund duration to a particular sender.
Once a message is funded, its duration cannot be extended. This allows hosts and users to not have to download any more of a message's payment proofs or make any more queries to blockchain hosts after they have confirmed that a message has been funded and retrieved its expiration time.
### Message Encryption Keys
Each Mate and Moderator profile has 2 public encryption keys: Nacl and Kyber.
These are used to encrypt all messages sent to that identity.
They are amnesic, meaning new keys are periodically broadcast in a user's profile and old keys are eventually deleted.
The same chat key set is used for all of an identity's conversations while those keys are in use.
Old keys are deleted from the client machine after the client has received and decrypted the last messages still existing for a particular set of keys. The client will wait to make sure any messages that have been encrypted with those keys have propagated throughout the network and been downloaded.
A user's profile and sent messages contain a **ChatKeysLatestUpdateTime** attribute to alert users when they have updated their keys on their profile. Users will not send messages to another user unless they have downloaded their current active chat keys. A grace period exists which allows old keys to still be used even after new keys have been broadcast.
Chat keys are generated from scratch on a user's machine and must be exported from the app to be able to decrypt old conversations upon signing in on a new client. This is done so that if someone were to compromise a user's seed phrase, they could not decrypt any of their chat messages.
If a user's computer is compromised, any amnesic keys on the user's computer could be used to decrypt all messages sent to that user. Messages which were deleted along with their keys would still remain encrypted.
### Device Identifier
Upon startup, the Seekia application creates a random device seed.
Unique device identifiers are generated from this seed for a user's Mate/Host identities.
These identifiers are broadcast on a user's profile and in their sent messages.
If Bob's device identifier changes, others know that they must discard Bob's old chat keys, and delete any secret inboxes they have saved for Bob. Chat keys and secret inboxes must be imported from an old device, and users will assume that Bob did not import them.
Device identifiers are also useful when a user restores their identity to a new device. The client will attempt to download any existing profiles authored by the user's identity from the network. The client compares the downloaded profile's device identifier with the client's device identifier, and is able to determine that the user's existing profile was created on a different device. This information is shown to the user when the client offers to import information from the profile.
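A minimal sketch of the device identifier idea is shown below. The derivation (SHA-256 over the seed followed by the identity type) and the function names `deviceIdentifier` and `peerChangedDevice` are assumptions made for illustration; the source only states that identifiers are generated from a random device seed.

```go
package main

import "crypto/sha256"

// deviceIdentifier derives a per-identity-type identifier from the device seed.
// Assumption: SHA-256 over seed||identityType stands in for the real derivation.
func deviceIdentifier(deviceSeed [32]byte, identityType string) [32]byte {
	input := append(deviceSeed[:], []byte(identityType)...)
	return sha256.Sum256(input)
}

// peerChangedDevice reports whether a peer's newly received device identifier
// differs from the one saved earlier. When it does, saved chat keys and secret
// inboxes for that peer should be discarded.
func peerChangedDevice(saved, received [32]byte) bool {
	return saved != received
}
```

Deriving a separate identifier per identity type means the identifiers of one user's different identities cannot be linked to the same device by comparing them.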
### Public Inbox
Each message is sent to an inbox. Users download messages by querying hosts for messages sent to their inboxes.
All users have a public inbox, which is a hash of their identity hash. Everyone can see how many messages each user has received in their public inbox. Public inboxes are a tradeoff: they increase the speed of downloading messages but are detrimental to privacy. A way to avoid having public inboxes is to require users to download many messages that were not sent to them in order to scan and determine which messages were sent to them, similar to how stealth addresses in blockchains work (see Monero view tags).
It is possible for anyone to see statistics about message recipients. An example would be a chart that shows Wealth on the X axis and Number Of Public Inbox Messages on the Y axis. This could be built into the Seekia app on the Message Statistics page.
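The sketch below illustrates why public inbox counts are visible to everyone. SHA-256, the `public-inbox:` prefix, and the function names are assumptions chosen for the example; the source only says that a public inbox is a hash of the identity hash.

```go
package main

import "crypto/sha256"

// publicInbox derives a public inbox from an identity hash by hashing it.
// Assumption: SHA-256 and the domain prefix are illustrative stand-ins.
func publicInbox(identityHash []byte) [32]byte {
	return sha256.Sum256(append([]byte("public-inbox:"), identityHash...))
}

// countPublicInboxMessages counts how many messages were addressed to the
// public inbox of the given identity. Anyone who knows an identity hash can
// derive the inbox and perform this count.
func countPublicInboxMessages(messageInboxes [][32]byte, identityHash []byte) int {
	target := publicInbox(identityHash)
	count := 0
	for _, inbox := range messageInboxes {
		if inbox == target {
			count++
		}
	}
	return count
}
```

Because the derivation needs nothing but the identity hash, an observer can compute these counts for every known identity and correlate them with profile attributes.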
A way to reduce this privacy flaw is to set up services which send many fake messages to inboxes with fewer messages. This would equalize the number of messages in all inboxes, making it much more difficult to tell who has received more messages. These messages need to be crafted and broadcast in a way that looks fully authentic. It may not be worth it, because it would increase the number of messages on the Seekia network substantially. A better option is for the service to not try to make all inbox quantities equal, but rather equalize only the lower tail end of public inbox quantities.
### Secret Inboxes
If users only had public inboxes, it would be possible to analyze the message sent times and other metadata about users to determine which users were communicating. For example, if 2 inboxes had increased activity around the same time periods, and their owners lived near each other, it would be possible to guess that those 2 users were communicating. This kind of analysis becomes more difficult as more users join Seekia.
To increase the privacy of chat messages, Seekia uses secret inboxes. Each message contains a Current and Next secret inbox seed. These seeds are used to generate the inboxes to which the recipient should send future messages. The recipient should send messages to the sender's Current and Next secret inboxes during the Current and Next secret inbox epochs. Each secret inbox is unique to each conversation recipient.
The secret inbox epoch is a time period that is defined by the secret inbox epoch duration, a variable provided within the network parameters. Each epoch start and end time is agreed upon by all users of Seekia.
Using secret inbox epochs allows for all secret inbox conversation pairs to change across the network at the same time. This facilitates a mixing effect and improves privacy. Without a global secret inbox epoch, a secret inbox's true recipient could be revealed by analyzing when a secret inbox stops receiving messages and a public inbox starts receiving messages.
A sender should not stop sending to a recipient's secret inbox until the epoch which the secret inbox belongs to has passed. If a secret inbox epoch is 3 days long, the recipient will send to the sender's 2 secret inboxes for a minimum of 3 days, or a maximum of 6 days. This depends on whether the most recent message was sent towards the beginning or the end of the current secret inbox epoch.
Bob knows to send to Alice's current and next secret inboxes whenever Alice sends a new message. Only 2 secret inboxes are needed, because if Alice has not responded to Bob's message during these 2 epochs, it is safe for Bob to send messages to Alice's public inbox. The risk of timing analysis attacks becomes greatly reduced. A shorter secret inbox epoch reduces the length of time that Alice's client has to check for new messages from her secret inboxes.
As more users join Seekia, the secret inbox epoch duration can be reduced, as timing attacks will become increasingly difficult.
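The epoch arithmetic described in this section can be sketched as follows. The alignment of epochs to unix time 0 and the function names are assumptions; the source only states that every user agrees on each epoch's start and end time. Times are assumed to be non-negative and the duration positive.

```go
package main

// epochBounds returns the start and end (unix seconds) of the secret inbox
// epoch containing unixTime. Assumption: epochs are aligned to unix time 0, so
// every client derives the same boundaries from the epoch duration alone.
func epochBounds(unixTime, epochDuration int64) (start, end int64) {
	start = unixTime - unixTime%epochDuration
	return start, start + epochDuration
}

// sendWindowEnd returns the time until which replies may be addressed to the
// secret inboxes announced in a message created at messageTime. This is the
// end of the epoch that follows the one in which the message was created.
func sendWindowEnd(messageTime, epochDuration int64) int64 {
	_, currentEnd := epochBounds(messageTime, epochDuration)
	return currentEnd + epochDuration
}
```

With a 3 day epoch, a message created at the very start of an epoch yields a 6 day window, and a message created one second before the end of an epoch yields a window of 3 days and 1 second, which matches the minimum and maximum described above.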
### Message Encryption Scheme
Messages are encrypted using Nacl and Kyber.
The ChaCha20Poly1305 (ChaPoly) cipher is used.
Two randomly generated 32 Byte keys are created.
These keys are XORed to derive a Basaldata decryption key.
The Basaldata decryption key is hashed to create a Message Cipher Key.
Both keys are encrypted: One with Nacl, and the other with Kyber.
The basaldata decryption key cannot be derived from the message cipher key. This is done because users share the message cipher key when reporting a message. Reporters are able to reveal the contents of the message communication and the message sender without revealing the Basaldata, which contains the message recipient (typically the person making the report) and other sensitive information.
In cases where the recipient of the message received the message to their secret inbox, revealing the cipher key would not reveal the message recipient's identity hash. For messages sent to the recipient's public inbox, the basaldata only prevents some of the sender's metadata from being revealed.
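The key derivation steps above can be sketched as follows. SHA-256 is used here only as a stand-in hash, and the function names are hypothetical; the Nacl/Kyber encryption of the two key pieces and the ChaPoly encryption itself are omitted.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
)

// newKeyPiece returns one randomly generated 32-byte key piece.
func newKeyPiece() ([32]byte, error) {
	var piece [32]byte
	_, err := rand.Read(piece[:])
	return piece, err
}

// deriveMessageKeys XORs two 32-byte key pieces into the Basaldata decryption
// key, then hashes that key to obtain the message cipher key.
// Assumption: SHA-256 stands in for whichever hash function Seekia uses.
func deriveMessageKeys(pieceA, pieceB [32]byte) (basaldataKey, cipherKey [32]byte) {
	for i := range basaldataKey {
		basaldataKey[i] = pieceA[i] ^ pieceB[i]
	}
	cipherKey = sha256.Sum256(basaldataKey[:])
	return basaldataKey, cipherKey
}
```

Because the cipher key is a one-way hash of the Basaldata key, sharing the cipher key in a report does not reveal the Basaldata key. Because the Basaldata key is the XOR of two random pieces, knowing only one piece reveals nothing about it, so an attacker would presumably have to break both the Nacl and the Kyber encryption to recover it.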
#### SealedKeys Encryption
The Nacl and Kyber encrypted key pieces are known as the SealedKeys.
The sealed keys are encrypted by a ChaPoly cipher using a SealedKeysSealerKey as the key.
For messages sent to a user's public inbox, the SealedKeysSealerKey is a hash of the recipient's identity hash.
For messages sent to a user's secret inboxes, the SealedKeysSealerKey is derived from the SecretInboxSeed.
The sealer key is used because revealing the SealedKeys increases the cryptographic attack surface. It is easier to determine who the messages were sent to if we have their public Nacl/Kyber keys, and the keys encrypted with those public keys. This may already be possible, or may become possible by some future cryptanalytic breakthrough.
The SealedKeysSealerKey only increases privacy for messages sent to a recipient's secret inboxes. The encryption of the SealedKeys is not needed for public inboxes, because the recipient's identity is already knowable by their public inbox. It is done anyway to make all messages look more similar. An observer would not know that the message was sent to a public inbox unless they had the recipient's identity hash, which is needed to derive their public inbox.
### Message Cipher Key Hash
All messages include a Message Cipher Key Hash, which is a hash of the message cipher key.
This is useful as a method for moderators to prove that they have seen the contents of the message.
Moderators share the message's cipher key in their reviews, and if the cipher key hashes to the message's cipher key hash, we know that the moderator has seen the contents of the message (or the message is malformed and the moderator is malicious).
Message cipher key hashes are stored in the database as metadata, so hosts and moderators can delete messages, keep their metadata, and still be able to verify that review authors have actually seen the contents of the message. This saves space by not requiring messages to be stored while still being able to verify message reviews.
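A minimal sketch of this check is shown below. The `messageMetadata` type, its field names, and the use of SHA-256 are assumptions for illustration; the point is that only the stored hash is needed to verify a review, not the message itself.

```go
package main

import "crypto/sha256"

// messageMetadata is what a host might keep after deleting a message.
// The field names are hypothetical.
type messageMetadata struct {
	MessageHash   [32]byte
	CipherKeyHash [32]byte
}

// cipherKeyHash hashes a message cipher key.
// Assumption: SHA-256 stands in for the hash function Seekia uses.
func cipherKeyHash(cipherKey [32]byte) [32]byte {
	return sha256.Sum256(cipherKey[:])
}

// reviewProvesAccess reports whether the cipher key revealed in a review
// matches the cipher key hash stored for the message.
func reviewProvesAccess(meta messageMetadata, revealedCipherKey [32]byte) bool {
	return cipherKeyHash(revealedCipherKey) == meta.CipherKeyHash
}
```

A review whose revealed key fails this check can be rejected without downloading or storing the message contents.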
### Reporting Messages
If a user wants to report messages, they will do so publicly. Reports are created anonymously. Each reported message's recipient is knowable if the message was sent to their public inbox, and is hidden if the message was sent to their secret inbox.
A message report contains the message hash and message cipher key. The cipher key is used by the moderators to decrypt the message.
The moderators provide the message cipher key with their reviews, which acts as a proof that the moderator has seen the message, and another source for moderators to retrieve the message's cipher key. Reviews provide a way for moderators to get the message's cipher key after the original message report has expired from the network.
### Downloading Messages
Users download their messages by downloading the messages within their public inboxes and active secret inboxes.
The client keeps a list of active secret inboxes, and checks them one-by-one from different hosts over new Tor circuits to prevent hosts from linking the inboxes together.
If a user contacts 100 users during a particular secret inbox epoch, this will add 100 secret inboxes to check for the current secret inbox epoch, and 100 to check for the next epoch.
Once a client has synced up from a certain unix time, it stops downloading messages from old secret inboxes. After an inbox expires, no new messages will be sent to it. After a secret inbox has expired and the user has checked it sufficiently for new messages, the user's client deletes the secret inbox.
This system may be too slow to download messages one-by-one if a user has hundreds of inboxes. An easy solution is to have trusted hosts, which would be listed in the parameters, who promise to not log requests. Users can request to download messages from all of their inboxes in the same request to these hosts, drastically increasing the speed at which they receive their messages. Another option is to download message inboxes 2-at-a-time, which only slightly reduces privacy. The same inbox pair should be provided in each request, to prevent hosts from learning more inboxes by linking the same inboxes from different pairs together.
It is important to note that sent messages will not appear as quickly in the recipient's client as they do for centralized messaging providers. A broadcast message must propagate throughout the network hosts before being downloaded by the recipient, which increases latency.
### Using Different Devices
Users can only use Seekia on one device at a time. Upon signing in to a new device, a user's profile broadcasts a device identifier, letting others know that they should discard their secret inboxes and chat keys, and download their new profile.
To transfer all conversation history and user data to a new device, a user can export and import their Seekia data.
Implementing a multi-device scheme would add a lot of complexity. Users would have to download the messages they sent from their other devices, which would be encrypted with keys that all of their devices had downloaded. Some form of cloud storage would be necessary, which could be updated whenever the user performed actions such as adding a user to their contacts or likes. Observers could identify when the user updated their encrypted cloud container and learn when they are online and potentially what they are doing. Adding these features is not worth the hassle.
Users who want to use multiple devices simultaneously could instead remotely access the device that their Seekia identity is signed in to. This feature should be built into the application, so the usage experience would be near-identical. They would have to leave their main device running.
### Resisting Network Analysis
All conversations are initiated by sending a message to a user's public inbox.
Responses that are sent within the current or next secret inbox epoch, relative to the message's creation time, will be sent to the user's secret inboxes.
If a secret inbox receives a new message immediately after Bob's public inbox receives one, an observer could guess that the message was a response from Bob. This becomes much more difficult to guess as more users join Seekia.
An observer could match up pairs of secret inboxes by correlating message sent times. The observer should still be unable to tell who is using each secret inbox.
Secret inboxes are retired for both conversation parties at the end of the epoch, which gives observers less time to correlate two secret inboxes together.
All secret inboxes are updated at the same time across all users, which makes it harder to link a previous secret inbox pair to a second secret inbox pair.
If all users refreshed their secret inboxes independently from the rest of the network, it would be easier to link a newly created secret inbox to an existing secret inbox through frequency analysis. An observer could see when one secret inbox stops receiving messages, and the next one starts receiving messages, and guess that those inboxes were owned by the same user.
Message metadata can greatly reduce privacy. The more unique a user's messaging patterns are, the easier it is to identify the messages they send. For example, if a user sends 14 image messages to many different users in quick succession, each group of 14 images could be identified and linked to the same sender.
As the number of Seekia users increases, the anonymity set grows larger, reducing the ability for observers to analyze the network.
## Moderation
Seekia has an open and decentralized moderation system.
Moderators create identity, profile, attribute, and message reviews.
Each identity, profile, and message has a **Verdict**. A verdict is calculated by collecting and analyzing all of the reviews for an identity, profile, or message. Users can be Banned or Not Banned. Messages/Profiles can be Approved or Banned.
Anyone can participate as a moderator. They must first fund their identity with cryptocurrency.
Anyone can report users, profiles, attributes, and messages. Reports are intended to only be created by non-moderators. Moderators should instead create a ban review to achieve the same effect. Reporting a message involves revealing the decryption key so moderators can view the message.
### Identity Score
Moderators must have a sufficient identity score to participate.
An identity score is the sum of the gold value, at the time each deposit was sent, of all cryptocurrency deposits sent to a moderator's identity score crypto addresses.
A moderator's identity score crypto addresses are derived from their identity hash. Any funds sent to them are destroyed forever.
There are two gold values defined in the network parameters: The minimum amount required to be a moderator, and the minimum amount required for the moderator to be able to ban other moderators.
Moderators can always send more money to their addresses to increase their score, and can do the same for other moderators whom they trust.
All moderators can be sorted based on their identity scores. A moderator's rank is defined by this list.
Moderators can ban moderators who are below them in rank.
The identity score exists as a bad behavior deterrent. If a moderator begins banning ruleful content and moderators, a higher ranked moderator can ban them. Thus, it costs money to be a malicious moderator. Once a moderator's identity is banned, they can try to convince the moderator(s) to un-ban them, try to convince higher ranked moderators to ban the moderator(s) who banned them, or spend more money to outrank and ban the moderator(s) who banned them. This system encourages cooperation, because moderators will not want to play leapfrog by spending more and more money to ban each other.
If there are enough good-natured moderators, the system should be able to root out bad moderators while still being decentralized.
A moderator's score also determines how many reviews they can upload to the network. This limit exists to prevent malicious moderators from spamming the network with reviews.
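The scoring and ranking rules in this section can be sketched as follows. The types, the use of `float64` for gold values, and the tie handling (equal scores cannot ban each other) are assumptions for illustration. Supermoderators and review limits are not modeled.

```go
package main

import "sort"

// deposit is one payment to a moderator's identity score address.
type deposit struct {
	GoldValueAtTimeSent float64 // gold value of the deposit when it was sent
}

// moderator is a hypothetical, simplified moderator record.
type moderator struct {
	IdentityHash string
	Deposits     []deposit
}

// identityScore sums the gold values of all of a moderator's deposits.
func identityScore(m moderator) float64 {
	total := 0.0
	for _, d := range m.Deposits {
		total += d.GoldValueAtTimeSent
	}
	return total
}

// rankModerators returns the moderators whose score meets minimumScore,
// ordered from highest to lowest score.
func rankModerators(mods []moderator, minimumScore float64) []moderator {
	ranked := make([]moderator, 0, len(mods))
	for _, m := range mods {
		if identityScore(m) >= minimumScore {
			ranked = append(ranked, m)
		}
	}
	sort.SliceStable(ranked, func(i, j int) bool {
		return identityScore(ranked[i]) > identityScore(ranked[j])
	})
	return ranked
}

// canBan reports whether banner may ban target: banner must meet the
// threshold for banning moderators and must have a strictly higher score.
func canBan(banner, target moderator, minimumScoreToBan float64) bool {
	bannerScore := identityScore(banner)
	return bannerScore >= minimumScoreToBan && bannerScore > identityScore(target)
}
```

The two thresholds correspond to the two gold values defined in the network parameters: one to participate as a moderator, and a second one to be able to ban other moderators.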
### Supermoderators
Supermoderators are moderators who can ban all moderators below them in rank, without necessarily having a higher identity score. They are given this power by the admin(s), and the list of supermoderators is provided within the network parameters. Supermoderators are described in a ranked order, and can ban each other.
Supermoderators should ideally only need to use their power in extreme circumstances. Supermoderators are a tool to salvage the moderation system if many malicious moderators join and become highly ranked.
Without supermoderators, if a single moderator spends enough funds to become the highest ranked moderator, they could ban all other moderators and cripple the network. This malicious moderator could only be defeated by at least 1 good moderator who burns more money than the malicious one to be able to ban them. With supermoderators, a supermoderator can ban this malicious moderator to rescue the network from the control of the malicious moderator, without having to spend any money.
Supermoderators currently only have the absolute authority to ban other moderators. Their profile/message/attribute reviews are counted the same as other moderators. It is designed this way because it simplifies the protocol and code, and increases decentralization by reducing the power of supermoderators. If a profile is wrongfully approved/banned, banning the moderators who created those reviews will undo the wrongful verdicts.
### Verdicts
There are two kinds of verdicts. Each review contains a verdict, and each identity/profile/message has a network consensus verdict.
*TODO: Change the name of Verdict to something else for reviews. Perhaps "Judgement"?*
Below are the types of review verdicts:
* Identity: Ban/None
* Profile: Approve/Ban/None
* Attribute: Approve/Ban/None
* Message: Approve/Ban/None
Below are the types of network consensus verdicts:
* Identity: Banned/Not Banned
* Profile: Approved/Banned/Undecided
* Message: Approved/Banned/Undecided
User identities cannot be approved; they can only be banned. This is because a user can always change their behavior to become unruleful. Only profiles and messages can be approved, because their content is static.
The process to calculate the verdict of an identity/profile/message is complex. It involves using the identity scores of moderators to weight their verdicts.
See `internal/moderation/verifiedVerdict/verifiedVerdict.go` to see how verdicts are calculated.
### Viewable Statuses
Each identity/profile/message has a Viewable status.
A viewable profile/identity/message is one that can be displayed to users. Hosts can choose not to host unviewable profiles and messages if they want to avoid hosting unruleful and illegal content. Unviewable profiles and messages should eventually be deleted from the network.
When downloading Mate profiles to browse, users use the `GetViewableProfilesOnly` parameter to only download viewable profiles. Mate users also download the viewable statuses for identities and profiles they have downloaded. The GUI will only show matches who are viewable, and will warn users when trying to view unviewable profiles.
#### Sticky Viewable Status
Calculating viewable statuses requires first calculating sticky viewable statuses.
Sticky viewable statuses are a kind of consensus status that requires a verdict to be present for a minimum defined period of time. To calculate a viewable status, its verdict history is needed.
Sticky statuses are needed to defend against malicious moderators.
Imagine this scenario: A malicious moderator bans all other moderators and all content on the network. All identities, profiles, and messages on the network are now **Banned** (except for the malicious moderator). Other moderators need to ban this moderator to undo the damage.
Without sticky consensus, all the hosts would treat all network profiles as being banned, and would stop seeding these profiles to users. This single malicious moderator could cripple the network for as long as it would take to ban that moderator. Banning this moderator could take hours, and is more difficult the more highly ranked they are.
Sticky statuses attempt to solve this problem.
With sticky consensus, as long as a profile/message/identity has been viewable for a certain period of time, its sticky viewable status becomes stuck.
For the sticky status to be switched to Unviewable, the profile/message/identity's status would have to be Unviewable for a certain period of time.
Hosts will serve content to users based on each identity/profile/message's sticky viewable status, not its real-time consensus verdict.
#### Sticky Status Establishing Time
Hosts must be online for long enough to determine, or establish, the sticky status for content within their ranges. Each sticky status can only be considered established if the user's client has been downloading the content's reviews for long enough.
This establishing time is needed for several reasons:
1. When adding a new range, the host needs time to initially download the reviews for content within the range.
2. Hosts may initially get an inaccurate view of the sticky status due to malicious hosts or hosts that are not caught up with the rest of the network.
3. The status may have only recently been flipped by many malicious moderators, and will be flipped back to the "true" status after those malicious moderators are banned. Without waiting, the host would only see a small portion of the true verdict history.
    * For example, during the last 5 minutes, a profile was viewable 100% of the time.
    * But within the last 50 minutes, it has been viewable for only 10% of the time.
Hosts will only share an identity/profile/message's viewable status to requesting peers once the status is established.
#### Sticky Viewable Status versus Viewable Status
Each identity/profile/message has a viewable status and a sticky viewable status.
Throughout the code, you will see **Viewable Status** and **Sticky Status** to describe each.
An Identity's sticky viewable status is always identical to its viewable status.
For Messages/Profiles, sticky viewable statuses and viewable statuses are identical, except that viewable statuses take into account the sticky viewable status of their author.
For example, let's say a profile is approved, but was created by an author who is banned.
* The profile's viewable status is False.
* The profile's sticky viewable status is True.
*This is rather confusing. Maybe use Seeable and Unseeable for sticky statuses.*
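The relationship between the two statuses for profiles and messages can be written as a one-line rule. The function name is hypothetical, and this sketch assumes the author's sticky status is the only additional input, as described above.

```go
package main

// contentViewableStatus returns the viewable status of a profile or message.
// The content is viewable only if its own sticky viewable status is true and
// the sticky viewable status of its author is also true.
func contentViewableStatus(contentStickyViewable, authorStickyViewable bool) bool {
	return contentStickyViewable && authorStickyViewable
}
```

For the example above, an approved profile (sticky status true) by a banned author (sticky status false) has a viewable status of false.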
#### Calculating Sticky Viewable Status
Calculating a sticky status involves checking what the real-time consensus verdict was for the identity/profile/message for the past period, and calculating what percentage of those verdicts were viewable.
The table below describes the verdicts that define whether an identity, profile, or message is viewable:
Type | Viewable | Unviewable
--- | --- | ---
**Identity** | Not Banned | Banned
**Mate Profile** | Approved | Undecided/Banned
**Host/Moderator Profile** | Approved/Undecided | Banned
**Message** | Approved/Undecided | Banned
Calculating a sticky viewable status requires 3 values, which are provided in the network parameters.
Identities, Profiles, and Messages each have their own set of these 3 values:
1. StatusEstablishingTime
    * This describes, in seconds, the amount of time that a host must be downloading the reviews for an identity/profile/message to be able to determine its sticky status.
2. VerdictExpirationTime
    * This describes, in seconds, the amount of time that consensus verdicts should be included in a sticky status's verdict history.
    * For example, if the verdict expiration time is 10000, then any verdicts that occurred more than 10000 seconds ago will not be included in the calculation.
3. MinimumViewablePercentage
    * The minimum percentage of verdicts that must be Viewable for the sticky status to be Viewable.
    * For example, if the value is set to 60% for identities, then an identity's verdict must be **Not Banned** for at least 60% of all verdicts younger than the VerdictExpirationTime.
See `/internal/moderation/verifiedStickyStatus/verifiedStickyStatus.go` for the full implementation.
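The calculation described in this section can be sketched as follows. This is not the code from `verifiedStickyStatus.go`: the types, the function names, and the representation of the verdict history as a list of timestamped samples are assumptions made for illustration.

```go
package main

// verdictSample is one recorded real-time consensus verdict (hypothetical type).
type verdictSample struct {
	Time     int64 // unix seconds when the consensus verdict was observed
	Viewable bool  // whether that verdict counts as viewable for this content type
}

// stickyParameters holds the 3 network parameter values for one content type.
type stickyParameters struct {
	StatusEstablishingTime    int64   // seconds
	VerdictExpirationTime     int64   // seconds
	MinimumViewablePercentage float64 // 0-100
}

// isViewableVerdict maps a consensus verdict to viewable or unviewable,
// following the table in this section.
func isViewableVerdict(contentType, verdict string) bool {
	switch contentType {
	case "Identity":
		return verdict == "Not Banned"
	case "Mate Profile":
		return verdict == "Approved"
	case "Host/Moderator Profile", "Message":
		return verdict == "Approved" || verdict == "Undecided"
	default:
		return false
	}
}

// stickyStatus returns the sticky viewable status and whether it is established.
// startedDownloading is when the host began downloading the content's reviews.
func stickyStatus(history []verdictSample, startedDownloading, now int64, p stickyParameters) (viewable, established bool) {
	if now-startedDownloading < p.StatusEstablishingTime {
		return false, false // not enough observation time yet
	}
	total, viewableCount := 0, 0
	for _, s := range history {
		if s.Time > now || now-s.Time > p.VerdictExpirationTime {
			continue // ignore future samples and expired verdicts
		}
		total++
		if s.Viewable {
			viewableCount++
		}
	}
	if total == 0 {
		return false, false // no verdict history to judge from
	}
	percentage := 100 * float64(viewableCount) / float64(total)
	return percentage >= p.MinimumViewablePercentage, true
}
```

With a minimum viewable percentage of 60, a profile that was viewable in 9 of its last 10 unexpired verdicts keeps a viewable sticky status even if the most recent verdict flipped to Banned, which is the protection against sudden malicious bans described above.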
### Trusted Viewable Statuses
Normal users download viewable statuses from hosts for content they download.
Users must download the viewable statuses from multiple hosts to be sure that the statuses are accurate. This requirement makes it unlikely that a user's client will mistakenly believe an unviewable profile is viewable.
### Banned Message Authors
The Seekia application will download the viewable status of all message authors for a user. If a message's author is banned, the application will hide messages sent by that user, unless the user decides to show messages from banned users. This will be very useful if identities are created which spam the network with advertisements and junk. The Seekia application does not have a spam message filter, so banning these users will hide their messages for all users.
Users may be wrongfully banned, or users may still want to communicate with the person, even if they were banned from the network. Assuming the user is not unbanned, their profile will disappear from the network, but they should still be able to chat with users who choose to show Banned users in their conversations.
The banned user's chat keys should still work, allowing the user to migrate to different communication channels.
One interesting possible feature is to ban a user's public inbox after they are banned. This would not allow messages to be sent to banned users' public inboxes anymore. If this was implemented, a grace period where the user's inbox would be allowed to exist would be necessary. This grace period would prevent a user's inbox from being deleted if they are wrongfully banned and then unbanned shortly after. Secret inboxes would be immune from inbox bans, so the banned user would still be able to chat by initiating conversations with users. Users would not be able to initiate conversations with banned users, but they would be able to respond and continue the conversation. The Seekia app would also have to warn users that their messages will not be delivered if the user's client believes that the message's recipient is banned, and the message is being sent to a public inbox.
### Profile Attribute Reviews
When reviewing profiles, moderators can submit Profile reviews or Attribute reviews.
An Attribute review is a review of a specific attribute within a profile.
Attribute reviews have several advantages:
1. A moderator can specify the attribute that caused them to ban a profile.
    * Example: Ban profile because Description was unruleful.
2. Moderators do not have to approve all attributes of a profile again if the user resubmits their profile with 1 attribute changed.
    * The moderators only have to approve the single changed attribute, because the moderators could have already approved all of the profile's other attributes.
3. Moderators can choose the kinds of attributes they want to review.
    * A moderator can choose to only review images, and they can still contribute to the network and reduce the amount of work other moderators have to do.
4. Moderators can review 1 kind of attribute at a time.
    * Within the GUI, moderators choose the attribute they want to review, and cycle through the attribute value for each user profile. This reduces the cognitive load of context switching and increases moderator efficiency.
Full profile reviews still exist because moderators sometimes have to ban profiles which have no specific unruleful attribute, but are still unruleful. For example, a profile's photo may be of a young man while the profile's age is 100. In that case, if a moderator banned the Photos attribute, it would be unclear for other moderators why the photo was banned if the photo itself was ruleful. The moderators would have to read the moderator's reason for banning the photo to understand. The GUI should prominently show how many moderators have banned/approved an attribute, and the reasons should be easily accessible to avoid misunderstandings.
### Undoing Reviews
There is a third type of verdict called a **None** verdict. These verdicts are used to undo previous verdicts. A review with a newer CreationTime is created for the same reviewedHash, and the old review is eventually discarded. The old review may be kept on the network if the review was used as a reason for banning the moderator in any identity ban reviews.
Attribute reviews add complexity to how a profile's reviews are undone. If a moderator bans an attribute, and later approves the full profile, then the attribute review is disregarded. If a moderator approves a full profile, and later bans an attribute from that profile, the full profile approval is disregarded.
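The undo rules above can be sketched as follows for the reviews one moderator has created about a single profile and its attributes. The `review` type and the function name are hypothetical, and cases the text does not describe (for example, an attribute approval followed by a full profile ban) are left untouched by this sketch.

```go
package main

// review is a simplified, hypothetical review record.
type review struct {
	ReviewedHash string // hash of the profile or of one of its attributes
	IsAttribute  bool   // true for attribute reviews, false for full profile reviews
	Verdict      string // "Approve", "Ban", or "None"
	CreationTime int64  // unix seconds
}

// effectiveReviews returns the reviews that still count, keyed by ReviewedHash.
func effectiveReviews(reviews []review) map[string]review {
	// Step 1: for each reviewedHash, the review with the newest CreationTime wins.
	latest := make(map[string]review)
	for _, r := range reviews {
		if prev, ok := latest[r.ReviewedHash]; !ok || r.CreationTime > prev.CreationTime {
			latest[r.ReviewedHash] = r
		}
	}
	// Step 2: find the newest full profile approval and the newest attribute ban.
	var profileApproveTime, attributeBanTime int64 = -1, -1
	for _, r := range latest {
		switch {
		case !r.IsAttribute && r.Verdict == "Approve" && r.CreationTime > profileApproveTime:
			profileApproveTime = r.CreationTime
		case r.IsAttribute && r.Verdict == "Ban" && r.CreationTime > attributeBanTime:
			attributeBanTime = r.CreationTime
		}
	}
	// Step 3: drop None verdicts and reviews that a later review disregards.
	result := make(map[string]review)
	for hash, r := range latest {
		switch {
		case r.Verdict == "None":
			continue // a None verdict undoes the earlier review
		case r.IsAttribute && r.Verdict == "Ban" && profileApproveTime > r.CreationTime:
			continue // a later full profile approval disregards the attribute ban
		case !r.IsAttribute && r.Verdict == "Approve" && attributeBanTime > r.CreationTime:
			continue // a later attribute ban disregards the full profile approval
		}
		result[hash] = r
	}
	return result
}
```

In every case the newer review determines the outcome, which keeps the rule consistent between attribute reviews and full profile reviews.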
### Content And Review Pruning
Once content becomes banned, it is still hosted by the network for some time. This is because if it was wrongfully banned, ruleful moderators need to be able to see what the content contained to determine if those who banned it were wrong for doing so.
Once content and/or its author have been banned for long enough, hosts will delete the content. They will maintain its metadata so they can continue to verify reviews and be aware if the content is within their range (see `contentMetadata.go`).
Reviews of banned identities/content will be kept until the identity/content expires from the network. It may be possible to delete reviews for content before it expires from the network if the content's author has been banned, but this should only be done if the author has been banned by a substantial enough number of moderators. The application should also wait a certain amount of time, because we don't want to lose the historical reviews from wrongfully banned moderators if a malicious moderator bans all moderators.
Each moderator's Seekia application should keep all of its locally authored and broadcast reviews until the reviewed identity/content has expired. This way, if the reviews are dropped from the network prematurely, the moderator's client will be able to rebroadcast those reviews.
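The host-side pruning described in this section could be sketched as a small decision function. The type names, the `banGracePeriod` parameter, and the use of Unix timestamps are assumptions made for illustration. The actual durations and data structures are defined in the Seekia code.

```go
package main

// PruneAction describes what a host does with a piece of content.
type PruneAction int

const (
	KeepContent               PruneAction = iota // keep content and metadata
	DeleteContentKeepMetadata                    // drop content, keep metadata for review verification
	DeleteAll                                    // content has expired from the network
)

// ContentStatus is a hypothetical summary of what a host knows about some content.
type ContentStatus struct {
	IsBanned       bool
	BannedTime     int64 // Unix time at which the ban took effect
	ExpirationTime int64 // Unix time at which the content expires from the network
}

// DecidePruneAction returns the action for the content at time now.
// banGracePeriod is an assumed number of seconds that banned content stays
// hosted so that ruleful moderators can still inspect it.
func DecidePruneAction(status ContentStatus, now int64, banGracePeriod int64) PruneAction {
	if now >= status.ExpirationTime {
		return DeleteAll
	}
	if status.IsBanned && now-status.BannedTime >= banGracePeriod {
		return DeleteContentKeepMetadata
	}
	return KeepContent
}
```

Keeping the grace period as a parameter leaves open how long "long enough" is, which the text above does not specify.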
### Content Controversy
Each piece of content has a **Controversy** rating. Controversial content is content which has a large amount of disagreement around its verdict.
Moderators can use this rating to find and root out bad moderators. Controversial content may also be used to foster conversations between moderators to define the rules of the network.
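This section does not give a formula for the rating. One possible way to quantify disagreement, shown purely as an assumption, is to scale the minority share of verdicts to the range 0 to 1. The function name and the equal weighting of all moderators are hypothetical.

```go
package main

// ContentControversy returns a value between 0 and 1.
// 0 means all verdicts agree (or there are none), and 1 means an even split.
// Assumption: every moderator's verdict counts equally.
func ContentControversy(numApprove int, numBan int) float64 {
	total := numApprove + numBan
	if total == 0 {
		return 0
	}
	minority := numApprove
	if numBan < minority {
		minority = numBan
	}
	return 2 * float64(minority) / float64(total)
}
```

For example, 3 approvals and 1 ban give a rating of 0.5, while a 5 to 5 split gives 1.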
### Moderator Controversy
Each moderator has a **Controversy** rating. Controversial moderators are moderators who disagree with other moderators the most often. Users can view and sort moderators by their controversy to aid in rooting out unruleful moderators.
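As a rough sketch, a moderator's rating could be the fraction of their verdicts that differ from the majority of other moderators' verdicts on the same content. This formula and the `ReviewedContent` type are assumptions for illustration, not the rating Seekia computes.

```go
package main

// ReviewedContent is a hypothetical record of one piece of content a moderator reviewed.
type ReviewedContent struct {
	ModeratorBanned bool // true if this moderator banned the content
	OtherApprovals  int  // approvals from other moderators
	OtherBans       int  // bans from other moderators
}

// ModeratorControversy returns the share of the moderator's verdicts that
// disagree with the majority of other moderators, between 0 and 1.
// Content without a clear majority among the others is skipped.
func ModeratorControversy(history []ReviewedContent) float64 {
	compared := 0
	disagreements := 0
	for _, content := range history {
		if content.OtherApprovals == content.OtherBans {
			continue // no majority to compare against
		}
		majorityBanned := content.OtherBans > content.OtherApprovals
		compared++
		if majorityBanned != content.ModeratorBanned {
			disagreements++
		}
	}
	if compared == 0 {
		return 0
	}
	return float64(disagreements) / float64(compared)
}
```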
### Automatic Banning
Moderator clients will automatically ban other moderators who create invalid reviews. These reviews cannot be created by the Seekia application, so any such reviews must have been authored by a custom piece of software.
Moderators could also automatically ban malicious hosts. Malicious hosts are hosts who provide invalid data in their network responses. The moderator can be certain that a host is malicious, because they downloaded the malicious response themselves, and the response must have been signed by the host's identity key.
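The reasoning above relies on two checks: the response is authentically signed by the host, and its contents are invalid. A minimal sketch follows, assuming for illustration that identity keys are Ed25519 keys and that the signature covers the raw payload. The `HostResponse` type, the `payloadIsValid` callback, and the `signDemoResponse` helper are hypothetical.

```go
package main

import "crypto/ed25519"

// HostResponse is a hypothetical signed response downloaded from a host.
type HostResponse struct {
	HostIdentityKey ed25519.PublicKey
	Payload         []byte
	Signature       []byte
}

// IsProvablyMalicious reports whether the response is authentically signed
// by the host's identity key and yet contains invalid data.
// An unsigned or wrongly signed response cannot be attributed to the host,
// so it is not treated as proof of malice.
func IsProvablyMalicious(resp HostResponse, payloadIsValid func([]byte) bool) bool {
	if len(resp.HostIdentityKey) != ed25519.PublicKeySize {
		return false
	}
	if !ed25519.Verify(resp.HostIdentityKey, resp.Payload, resp.Signature) {
		return false
	}
	return !payloadIsValid(resp.Payload)
}

// signDemoResponse creates a response signed with a freshly generated key.
// It exists only to exercise the sketch.
func signDemoResponse(payload []byte) (HostResponse, error) {
	publicKey, privateKey, err := ed25519.GenerateKey(nil)
	if err != nil {
		return HostResponse{}, err
	}
	resp := HostResponse{
		HostIdentityKey: publicKey,
		Payload:         payload,
		Signature:       ed25519.Sign(privateKey, payload),
	}
	return resp, nil
}
```

Requiring a valid signature before banning prevents a third party from framing an honest host with forged responses.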
## Conclusion
Thank you for reading the Seekia documentation.
Much remains to be described in this document. Read the code to fully understand the inner workings of Seekia.