Néstor Vázquez Bernat, Head of Application Science at ENPICOM, briefly introduced his company: ENPICOM serves ML scientists by developing software for biologics discovery, with a team whose expertise spans immunology, computer science, machine learning, and software development. The main aim is to empower scientists and researchers by giving them the autonomy to visualise their data and select the best leads in an intuitive environment. In addition, ENPICOM's platform maximises the ROI of AI and frees up time for computational biologists and data analysts.
The general consensus among both AI users and non-users is that AI will transform day-to-day research. Over the last five years, there has been a 17% yearly increase in patents and a 34% increase in publications. Vázquez Bernat explained that while there is excitement and interest around AI, the level of adoption is not that high. The principal barriers to adoption are the need for FAIR data, attracting and retaining expertise, and having the right tools to track, compare, and deploy models.
As a result, ENPICOM has created a highly scalable data infrastructure that facilitates data ingestion, storage, and access. The team has come up with a three-step blueprint for moving from current non-AI discovery to the successful integration of AI into discovery: data foundation, pipeline automation and consolidation, and AI adoption.
Focusing on a solid data foundation ensures that all the data is available for training and deploying models. Step two, pipeline automation and consolidation, is key because non-consolidated data remains heterogeneous, non-homogenised, and full of disparities, while non-automated pipelines create data silos. The third step, AI adoption, covers creating new models and managing the model life cycle so that they can be deployed back into discovery. Vázquez Bernat stressed the importance of testing models as early as possible in a real campaign, because some models that appear faultless on paper fail in practice.
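The internals of ENPICOM's pipelines were not shown, but the consolidation idea in step two can be illustrated with a minimal, hypothetical sketch: antibody records from two differently structured sources are mapped onto one shared schema before any model sees them. All column names, units, and values below are invented for illustration only.

```python
# Hypothetical sketch of consolidation: harmonise two differently structured
# antibody datasets into one shared schema before model training.
# Column names, units, and values are invented for illustration only.
import pandas as pd

# Source A: records exported from one instrument or LIMS
source_a = pd.DataFrame({
    "clone_id": ["A1", "A2"],
    "heavy_chain": ["EVQLVESGGGLVQ", "QVQLQESGPGLVK"],
    "kd_nM": [3.2, 15.0],
})

# Source B: the same kind of data, but with different names and units
source_b = pd.DataFrame({
    "id": ["B7", "B9"],
    "vh_sequence": ["EVQLLESGGGLVQ", "QVQLVQSGAEVKK"],
    "affinity_M": [2.1e-9, 8.5e-8],
})

# Map both sources onto one homogenised schema (id, vh_sequence, kd_nM)
consolidated = pd.concat(
    [
        source_a.rename(columns={"clone_id": "id", "heavy_chain": "vh_sequence"}),
        source_b.assign(kd_nM=source_b["affinity_M"] * 1e9).drop(columns="affinity_M"),
    ],
    ignore_index=True,
)

print(consolidated)
```

Without this kind of harmonisation, each downstream model would need its own source-specific handling, which is exactly the disparity and silo problem the blueprint is meant to avoid.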
Moving on to the advantages of the platform, ENPICOM's data infrastructure can easily be integrated with other systems in clients' organisations, making data consolidation and automation straightforward. On top of this, the infrastructure can house billions of clones; despite this massive data volume, sequencing performance never dropped, demonstrating its efficiency and efficacy.
Additionally, ENPICOM has developed various tools for gene annotation and quality control. For example, the sequence annotation tool can process data of any size and works ten times faster than other tools in the industry. Most importantly, it also carries lower computational costs.
Vázquez Bernat presented an example of how a model can be tested in a real scenario and how it can be tweaked to enhance its capabilities. Using AppMap, the team trained the model on the binding properties of antibodies, then tested it against new data using different modalities and parameters. They selected the best-performing version and deployed it. The process was successful, and deployment was fast.
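AppMap's internals were not described in the talk; the following is a generic, hypothetical sketch of the same train, compare, select, and deploy loop using scikit-learn. The features, labels, and candidate models are invented stand-ins, not ENPICOM's workflow.

```python
# Hypothetical sketch of the train / compare / select / deploy loop described
# in the talk. Features, labels, and candidate models are invented; this is
# only an illustration of the general pattern, not ENPICOM's AppMap workflow.
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 16))                  # stand-in for antibody sequence features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # stand-in for binder / non-binder labels

# Hold out "new data" to mimic testing against a real campaign as early as possible
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.3, random_state=0)

# Candidate models with different parameters
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf_small": RandomForestClassifier(n_estimators=50, random_state=0),
    "rf_large": RandomForestClassifier(n_estimators=300, random_state=0),
}

# Train every candidate, score it on the held-out data, and keep the best one
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = roc_auc_score(y_new, model.predict_proba(X_new)[:, 1])

best_name = max(scores, key=scores.get)
print(f"selected model: {best_name} (AUC={scores[best_name]:.3f})")

# "Deploy" here simply means persisting the selected model for downstream use
joblib.dump(candidates[best_name], "best_binding_model.joblib")
```

The key point of the pattern, echoing the talk, is that candidates are compared on data they have never seen before any of them is promoted to deployment.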