From 54f25588573afab7a99425322a96158173b6fd98 Mon Sep 17 00:00:00 2001
From: Simon Sarasova <simonsarasova>
Date: Wed, 4 Sep 2024 16:02:05 +0000
Subject: [PATCH] Improved Future-Plans.md.

---
 Changelog.md                  | 1 +
 Contributors.md               | 2 +-
 documentation/Future-Plans.md | 2 ++
 3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/Changelog.md b/Changelog.md
index e71be33..79a630c 100644
--- a/Changelog.md
+++ b/Changelog.md
@@ -6,6 +6,7 @@ Small and insignificant changes may not be included in this log.
 
 ## Unversioned Changes
 
+* Improved Future-Plans.md. - *Simon Sarasova*
 * Upgraded go-chart to version 2.1.2. - *Simon Sarasova*
 * Upgraded Badger to version 4.3.0. - *Simon Sarasova* 
 * Upgraded Whitepaper.pdf to Version 9. - *Simon Sarasova*
diff --git a/Contributors.md b/Contributors.md
index 103d953..df889f5 100644
--- a/Contributors.md
+++ b/Contributors.md
@@ -9,4 +9,4 @@ Many other people have written code for modules which are imported by Seekia. Th
 
 Name | Date Of First Commit | Number Of Commits
 --- | --- | ---
-Simon Sarasova | June 13, 2023 | 305
\ No newline at end of file
+Simon Sarasova | June 13, 2023 | 306
\ No newline at end of file
diff --git a/documentation/Future-Plans.md b/documentation/Future-Plans.md
index 75c5348..2455694 100644
--- a/documentation/Future-Plans.md
+++ b/documentation/Future-Plans.md
@@ -344,6 +344,8 @@ There are many benefits to having widely available open source genetic predictio
 
 All of this is already possible, but will become easier with the proliferation of more advanced open source genetic prediction models. Many advanced genetic prediction methods already exist, but many of them are closed source. Even using a public model trained on closed data would not be sufficient for Seekia's use case, because we need users to be confident that the models are reproducible and created from accurate data. We want the genetic future of the human species to be steered by open source technology.
 
+We can create training data by taking freely available genetic biobanks which lack phenotype data and using freely available polygenic scoring methods to infer the likelihood of each phenotype for each genome. See [pgscatalog.org](https://www.pgscatalog.org/) for polygenic scoring methods. This strategy would be inferior to having the original biobank data for two reasons: we would be trusting the people who provided the polygenic scoring methods, and predictions would be less accurate because we would be training on a prediction of a prediction. This strategy will likely be necessary anyway, because it would allow us to train models without having to gain access to biobank data which includes phenotypes.
+
 ### Add more diseases and traits
 
 Adding monogenic diseases entails entering disease SNP data from SNPedia.com and other sources. The bases have to be flipped if the orientation on SNPedia is minus. This requires flipping G/C and A/T. At least 3 people should check any added disease SNPs to ensure accuracy.