Genomics
Personal genome sequencing has the potential to enable better disease prevention, more accurate diagnoses, and personalized therapies. Sharing genomic data with researchers also promises to help identify the causes of many diseases (for example, Alzheimer's disease, brain cancer, and COVID-19) and to support the development of new therapies. These efforts will grow biomedical datasets at an unprecedented scale, enabling researchers to push the frontiers of genomic medicine. As the volume of genomic and health-related data explodes and our understanding of these data matures, the privacy of the individuals behind the data is increasingly at stake, and traditional approaches to protecting privacy have fundamental limitations. Blockchain-enabled federated learning is an emerging privacy-enhancing technology that can be applied here, allowing broader data sharing and collaboration in advancing genomics research.
Key Challenges
Regulation
Restrictive and complex government regulation hinders the sharing of genomic data. Sharing genetic and genomic data is vital for medical research and care, since it increasingly underpins medical diagnosis, treatment, and management. Excessive or unclear regulation could create unnecessary barriers and impede vital science and healthcare. On the other hand, ineffective or insufficient regulation could reduce public trust, breach privacy, and cause societal harm.
Data Storage
As more genome sequences become available and the cost of sequencing continues to decline, data growth from acquisition and analysis will reach a point where storing data repositories becomes economically impractical. Sequencing a single whole human genome can require up to 200 GB of storage space. Data at this scale is also slow to transfer over the network and incurs high costs.
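To make the scale concrete, here is a rough back-of-the-envelope calculation. Only the 200 GB per genome figure comes from the text above (and it is an upper bound); the cohort size and network speed are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope storage and transfer estimate.
# GB_PER_GENOME comes from the text (an upper bound); the cohort size and
# link speed below are illustrative assumptions.

GB_PER_GENOME = 200        # upper bound quoted in the text
N_GENOMES = 100_000        # assumed cohort size
LINK_GBIT_PER_S = 1.0      # assumed sustained network throughput


def total_storage_pb(n_genomes: int, gb_per_genome: float = GB_PER_GENOME) -> float:
    """Total raw storage in petabytes (decimal units: 1 PB = 1,000,000 GB)."""
    return n_genomes * gb_per_genome / 1_000_000


def transfer_days(total_gb: float, link_gbit_per_s: float) -> float:
    """Days needed to move total_gb over a link of the given speed."""
    seconds = total_gb * 8 / link_gbit_per_s  # GB -> gigabits, then divide by Gbit/s
    return seconds / 86_400


storage_pb = total_storage_pb(N_GENOMES)
days = transfer_days(N_GENOMES * GB_PER_GENOME, LINK_GBIT_PER_S)
print(f"{storage_pb:.0f} PB of storage, about {days:.0f} days to transfer")
# prints "20 PB of storage, about 1852 days to transfer"
```

Under these assumptions, moving the raw data to a central site would take roughly five years on a single 1 Gbit/s link, which is one reason to keep data where it is produced.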
Privacy
Human DNA contains extremely sensitive and private information. Genomic data is usually stored centrally with a third-party cloud provider, which poses high security risks. Problems arise not only from data security but also from confidentiality and from the data management practices of organizations that have access to large genomics datasets, all of which affect privacy.
Scarcity
Genomic data is fragmented across proprietary data silos within various organizations, which makes it difficult to access and exacerbates its scarcity. This hinders the acceleration of precision medicine.
Collaboration
Data handling, interpretation, and consent: the majority of healthcare organizations are unprepared to deal with emerging data sources or with access to high volumes of data for collaborative research. Challenges range from data quality, data sharing, and liability to the complexities of the privacy laws that they must comply with.
Solution overview
An emerging genomics industry framework that enables new business models by integrating blockchain and federated learning on the Terracuda.Ai platform.
[Figure: solution architecture. Labels shown: Client; Gene Application Platform; Sequencing laboratory (three); Model (three); Terracuda.Ai Platform; Federated Learning; IPFS decentralized storage; Blockchain; Data scientists; Biologists; Pharmaceutical.]
The gene platform, hosted by sequencing laboratories, onboards clients (who provide a genome sample) through smart contracts, with access privileges digitally signed on the blockchain. The cost of whole-genome sequencing is now at an all-time low, so rapidly growing genome services require the highest level of privacy, including for their expanding data storage.
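The idea of signed access privileges can be illustrated with a small sketch. It is not the platform's smart contract format: the record fields, identifiers, and key are hypothetical, and an HMAC from Python's standard library stands in for the asymmetric digital signature that a blockchain client would normally produce.

```python
import hashlib
import hmac
import json

# Hypothetical demo secret. A real deployment would use asymmetric keys
# (for example ECDSA) managed by the blockchain client, not a shared secret.
LAB_KEY = b"demo-only-secret"


def sign_grant(grant: dict, key: bytes) -> str:
    """Return a hex tag over a canonical JSON encoding of the grant."""
    payload = json.dumps(grant, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_allowed(grant: dict, tag: str, key: bytes, requester: str, action: str) -> bool:
    """Check the tag first, then the access privileges listed in the grant."""
    if not hmac.compare_digest(sign_grant(grant, key), tag):
        return False  # the grant was altered after signing
    return action in grant["privileges"].get(requester, [])


grant = {
    "client_id": "client-001",  # hypothetical identifiers throughout
    "sample_hash": hashlib.sha256(b"genome sample bytes").hexdigest(),
    "privileges": {"lab-A": ["train_model"], "pharma-B": []},
}
tag = sign_grant(grant, LAB_KEY)

print(is_allowed(grant, tag, LAB_KEY, "lab-A", "train_model"))     # True
print(is_allowed(grant, tag, LAB_KEY, "pharma-B", "train_model"))  # False
```

Only a hash of the sample appears in the record, so the grant can be shared and checked without exposing the genome data itself.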
Sequencing labs can form a consortium of permissioned members on the Terracuda.Ai platform, enabling them to interact with the rest of the network. They can then use the federated learning framework to run algorithms as a distributed computation on the platform, with global model updates that are securely managed, orchestrated, and traced on the blockchain, without the need to move data.
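The flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Terracuda.Ai implementation: it uses federated averaging (a common federated learning scheme) on a toy linear model, the lab datasets are made up, and a simple hash-chained list stands in for the blockchain that traces each global update.

```python
import hashlib
import json
from typing import List, Tuple

Weights = List[float]


def local_update(weights: Weights, data: List[Tuple[float, float]], lr: float = 0.1) -> Weights:
    """One gradient step for y = w*x + b, computed on a lab's private data."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return [w - lr * grad_w, b - lr * grad_b]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average local weights, weighting each lab by its number of samples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(wts[i] * n for wts, n in updates) / total for i in range(dim)]


def record_hash(round_no: int, weights: Weights, prev: str) -> str:
    body = json.dumps({"round": round_no, "weights": weights, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_record(ledger: List[dict], round_no: int, weights: Weights) -> None:
    """Append a global update, chained to the previous record by its hash."""
    prev = ledger[-1]["hash"] if ledger else "0" * 64
    ledger.append({"round": round_no, "weights": list(weights), "prev": prev,
                   "hash": record_hash(round_no, weights, prev)})


def verify_ledger(ledger: List[dict]) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev = "0" * 64
    for rec in ledger:
        if rec["prev"] != prev or rec["hash"] != record_hash(rec["round"], rec["weights"], rec["prev"]):
            return False
        prev = rec["hash"]
    return True


# Made-up private datasets, all drawn from y = 2x + 1. They never leave the labs.
LAB_DATA = {
    "lab-A": [(0.0, 1.0), (1.0, 3.0)],
    "lab-B": [(2.0, 5.0), (0.5, 2.0), (1.5, 4.0)],
    "lab-C": [(1.0, 3.0)],
}
ROUNDS = 300

global_weights: Weights = [0.0, 0.0]
ledger: List[dict] = []
for round_no in range(1, ROUNDS + 1):
    # Each lab trains locally and shares only its updated weights.
    updates = [(local_update(global_weights, d), len(d)) for d in LAB_DATA.values()]
    global_weights = federated_average(updates)
    append_record(ledger, round_no, global_weights)

# The global model should end up close to w = 2, b = 1.
print([round(v, 3) for v in global_weights], verify_ledger(ledger))
```

Only model weights cross lab boundaries in this sketch, and every global update leaves a tamper-evident record, which mirrors the "without the need to move data" and "traced on the blockchain" properties in the text.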
New business models will emerge in which sequencing laboratories provide a secure, privacy-preserving model through a revenue-share framework, collaborating with the healthcare community (data scientists, biologists, pharmaceutical companies) to improve prediction accuracy. This can optimize and accelerate the drug discovery process for precision medicine without revealing highly valuable in-house genome data.
The majority of the world's genomic medicine projects are conducted in isolation. Blockchain-enabled federated learning will enable research collaboration, accelerating innovation and breakthroughs in genomic science and medicine.