The amount of data from global biomedical research is massive and growing fast. Big Data should probably be renamed “Bigger Data.” The National Center for Biotechnology Information reported recently that the volume of data from genomic sequencing research alone, which doubles every seven months, equals or exceeds the data generated by astronomy, YouTube, and Twitter. New research starts with a survey of earlier studies, a task that becomes ever more daunting when the data is growing at this pace. Privacy concerns with biomedical data compound the issue.
Google Cloud’s healthcare and biomedical data team recently announced a partnership with the National Institutes of Health to build a global biomedical data ecosystem. Google will work with the STRIDES (Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) initiative to simplify access for appropriate parties to select NIH-funded datasets. The datasets will be protected with researcher authentication and authorization mechanisms integrated with Google Cloud credentials. To ensure compliance with industry standards for data access, discovery, and cloud computation, Google is working with the Global Alliance for Genomics & Health and the BioCompute Consortium. Google also brings its open source tools to the Big Data table. These tools, which the company continues to develop, will be used to structure and integrate biomedical datasets.
Google’s participation in biomedical data storage, protection, management, and access backs the data with storage capacity, resources, networks, and tools. The advantages of Google Cloud’s involvement for researchers are readily apparent, but the concept of a single private company controlling the world’s supply of any resource may raise concerns in some circles. There is knowledge buried within data, however, and increased access to publicly funded research can lead to wide-ranging advances in medicine and healthcare.