Four broad segments of Life Sciences (Genomics, Clinical Sciences, Pharmaceutical, and Proteomics) generate large volumes of data that are difficult to manage. Advanced analytics techniques are used to improve outcomes in areas such as drug discovery, disease understanding, patient engagement, personalized medicine, and product design. However, the data integration and processing challenges listed below keep Healthcare and Life Sciences (HCLS) companies from realizing the full value of digital technologies.
Amorphic is a production-ready data-lake-as-a-service platform that provides a self-service tool for data ingestion, preparation, transformation, ML, customized dashboards, and workload prototyping using AWS and third-party analytics services and tools. Some salient points of the Amorphic solution:
The next sections highlight how Amorphic simplifies a few HCLS use cases and shortens the time to insight.
The Amorphic platform can provision genomic data pipelines from a single interface. Amorphic jobs can configure workflows for genomic data that scale and run in parallel for cost-efficient data processing. Amorphic datasets can connect directly to sequencing platforms for seamless ingestion into S3, Redshift, and Athena. The platform can scale genomic pipelines on demand with just a few clicks. Integration with notebooks and dashboards can provide customizable, collaborative workspaces for scientists, computational biologists, and bioinformaticians. Amorphic integrates genomic workflow tools such as GATK and BLAST, provided as job templates, for customized genomic data analysis pipelines. Auto scaling and a serverless architecture can help save resource and engineering costs.
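To make the idea of a genomic workflow that runs in parallel more concrete, here is a minimal Python sketch of the general scatter-and-gather pattern: records are split into per-chromosome shards, each shard is processed independently, and the results are collected. It does not use any Amorphic API. The names `VARIANTS`, `shard_by_chromosome`, `count_snvs`, and `run_scattered` are hypothetical, and the variant records are made-up placeholders for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up placeholder records (assumption): (chromosome, position, ref, alt).
VARIANTS = [
    ("chr1", 10177, "A", "AC"),
    ("chr1", 10352, "T", "TA"),
    ("chr2", 10019, "T", "G"),
    ("chr2", 10133, "C", "T"),
    ("chr2", 10180, "T", "C"),
]


def shard_by_chromosome(records):
    """Scatter step: group records into one shard per chromosome."""
    shards = {}
    for record in records:
        shards.setdefault(record[0], []).append(record)
    return shards


def count_snvs(shard):
    """Per-shard work: count single-nucleotide variants (ref and alt of length 1)."""
    return sum(1 for _, _, ref, alt in shard if len(ref) == 1 and len(alt) == 1)


def run_scattered(records, max_workers=4):
    """Process every shard in a worker pool and gather the counts per chromosome."""
    shards = shard_by_chromosome(records)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = pool.map(count_snvs, shards.values())
        # zip pairs each chromosome with its count; map preserves input order.
        return dict(zip(shards.keys(), counts))


if __name__ == "__main__":
    print(run_scattered(VARIANTS))
```

In a real pipeline the per-shard function would typically invoke a tool such as GATK on a genomic interval rather than count records in memory; the sketch only shows why independent shards make the work easy to scale out.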
The Amorphic platform makes it easier to work with large clinical datasets through automated ingestion, transformation, and analytics that can support a variety of clinical data workloads. Amorphic comes with pre-built use cases for handling electronic health records (EHR), medical images, clinical trials, document analytics, wearables, and medical claims datasets. The ability to automate the ingestion, transformation, and management of data makes it easier for healthcare organizations to derive value from that data using analytics and machine learning. Amorphic handles both structured and unstructured data, with integrated profiling and ML services for datasets stored on AWS. Analytics dashboards and machine learning models can be provisioned on top of these datasets through integrations with Tableau, Power BI, Spotfire, SageMaker notebooks, and QuickSight. The serverless architecture can scale workloads up and down on demand and can handle petabyte-sized (population-scale) datasets.
Amorphic scales and speeds up drug discovery and cheminformatics workloads. The Amorphic platform for cheminformatics comes pre-configured with public drug and molecule datasets (ChEMBL, PubChem, and DrugBank), and more public databases can be added on demand. Drag-and-drop, low-code ETL allows easy cleaning, filtering, and transformation of molecular datasets, as well as building knowledge graphs, interactive dashboards, and collaborative notebooks for analytics. Amorphic connections can ingest experimental data through database APIs, REST APIs, and file systems. S3 storage supports unstructured data such as images, and S3 Athena profiling allows working directly on structured chemical formats that traditional databases do not support. Amorphic can build workflows that support topology, information retrieval, and data mining. Integration with ML and AI services (third-party and Amazon) allows advanced analytics on Amorphic chemical datasets with just a few clicks. The serverless architecture can provide a cost-efficient way to scale up and handle large datasets.
The Amorphic platform can scale to process large proteomics datasets, offering integration with experimental data, automated data management, drag-and-drop ETL, and provisioning of analytics workspaces with just a few clicks. Amorphic can provide instant access to HPC resources on AWS to handle structural biology workloads, optimizing cost and speeding up research and development workloads. Amorphic datasets can support biomarker and assay datasets, using both S3 and Athena to provide search and query capabilities. Integrating multiple datasets, such as biomarkers, organisms, and tissues, can produce information frameworks that simplify research and development. Collaborative workspaces allow data scientists, researchers, and bioinformaticians to work on top of the same datasets and increase throughput and productivity. End-to-end integration gives complete visibility from raw data to finished datasets and helps derive more value from proteomics research.
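As a rough illustration of what integrating biomarker, organism, and tissue datasets can look like, the following Python sketch joins three small lookup tables into one flat view that can then be searched. It is a generic join written under stated assumptions, not Amorphic's implementation. The names `BIOMARKERS`, `TISSUES`, `ORGANISMS`, `integrate`, and `find_by_tissue`, along with the identifiers and records, are hypothetical placeholders.

```python
# Illustrative placeholder records (assumption), keyed by hypothetical IDs.
BIOMARKERS = [
    {"biomarker": "PSA", "tissue_id": "T1", "organism_id": "O1"},
    {"biomarker": "Troponin I", "tissue_id": "T2", "organism_id": "O1"},
    {"biomarker": "ALT", "tissue_id": "T3", "organism_id": "O2"},
]
TISSUES = {"T1": "prostate", "T2": "heart", "T3": "liver"}
ORGANISMS = {"O1": "Homo sapiens", "O2": "Mus musculus"}


def integrate(biomarkers, tissues, organisms):
    """Join each biomarker record with its tissue and organism names.

    Unknown IDs resolve to None instead of raising, so gaps stay visible.
    """
    return [
        {
            "biomarker": row["biomarker"],
            "tissue": tissues.get(row["tissue_id"]),
            "organism": organisms.get(row["organism_id"]),
        }
        for row in biomarkers
    ]


def find_by_tissue(integrated_rows, tissue):
    """Return the names of biomarkers associated with the given tissue."""
    return [row["biomarker"] for row in integrated_rows if row["tissue"] == tissue]


if __name__ == "__main__":
    view = integrate(BIOMARKERS, TISSUES, ORGANISMS)
    print(find_by_tissue(view, "heart"))
```

At larger scale the same join would more likely be expressed as a SQL query over tables in Athena; the in-memory version simply shows the shape of the integrated view.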