Optimizing data management to streamline antimalarial research

2020
Dominique Besson

Dominique Besson tells us more about the processes that MMV has put in place to manage the huge quantities of data generated by its partners

MMV’s Discovery team routinely works on more than 25 research projects, supported by 25 technology platforms. These platforms, located all over the world, test very large numbers of compounds to evaluate their activity against different stages of the parasite life cycle. Central to the success of this large-scale research effort is a streamlined system for storing, tracking and dispatching compounds to MMV’s partners, which enables compounds to be delivered rapidly throughout the world and accelerates drug discovery initiatives. The workflow also generates large volumes of data that must be deposited in a standardized format, stored securely and made easily accessible to the relevant teams.

Please tell us about your work as Associate Director, Discovery Data at MMV.

In the context of a virtual R&D organization like MMV, it is important to ensure homogeneity in the way that data are collected, analyzed and reported. In my role, I have two main responsibilities. Firstly, as a compound manager, I ensure that molecules for testing are available at the right time and are of the right quality. Secondly, as a data manager, I check that the data generated by MMV’s various research partners are comparable, by making sure, for example, that all results are reported in the same standardized units. Day to day, I have certain ‘preventive’ duties, such as formulating rules, processes and guidelines relating to the database, as well as providing appropriate training to both internal and external end-users. I also have ‘corrective’ duties, such as taking action to correct processes or workflows whenever MMV’s quality control guidelines are not followed properly.
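To give a concrete, purely illustrative sense of what this standardization involves, the short Python sketch below normalizes assay results reported in different concentration units to a single standard unit before they are loaded; the field names, units and conversion factors are hypothetical and do not represent MMV’s actual conventions.

```python
# Hypothetical sketch: normalizing partner-reported assay results to a single
# standard concentration unit before loading them into a shared database.
# Field names, units and conversion factors are illustrative only.

# Conversion factors from commonly reported concentration units to micromolar.
TO_MICROMOLAR = {
    "M": 1e6,
    "mM": 1e3,
    "uM": 1.0,
    "nM": 1e-3,
}

def normalize_result(record):
    """Return a copy of an assay record with its IC50 expressed in uM.

    Records whose unit is not recognized are flagged for manual review
    rather than silently dropped.
    """
    factor = TO_MICROMOLAR.get(record.get("ic50_unit"))
    if factor is None:
        return {**record, "needs_review": True}
    return {**record, "ic50_uM": record["ic50_value"] * factor, "needs_review": False}

# Example: a partner lab reports an IC50 of 250 nM; it is stored as 0.25 uM.
partner_record = {"compound_id": "CPD-0001", "ic50_value": 250, "ic50_unit": "nM"}
print(normalize_result(partner_record))
```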

What is the importance of databases in research?

Data are the currency of drug discovery, and MMV’s database is the bank where we save these data. The value of MMV’s database is that it stores, and connects, all chemical and biological information in a single place. When information is well connected, a scientist can better understand how modifying the structure of a particular chemical compound is likely to impact its biological activity. The information kept in a database effectively forms a kind of ‘roadmap’ that explains how and why a given project reached a particular decision point. It is critical for this information to be easily extractable – and in a format that is useful for specific users. For example, we have developed a specific method of extracting preclinical data from the database in a format that helps the translational team to predict the human dose of a particular compound. Insights like these are invaluable to the progress of the portfolio.
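As an illustration of what ‘well-connected’ information can look like in practice, the sketch below joins a small, invented table of chemical records to an invented table of assay results, producing the kind of structure-activity view described above; the table layouts, field names and values are hypothetical and do not reflect MMV’s database schema.

```python
# Hypothetical sketch: linking chemical records to their biological results so
# that changes in structure can be compared against changes in activity.
# Identifiers, SMILES strings and values are invented for the example.

compounds = [
    {"compound_id": "CPD-0001", "smiles": "CCO", "series": "A"},
    {"compound_id": "CPD-0002", "smiles": "CCN", "series": "A"},
]

assay_results = [
    {"compound_id": "CPD-0001", "assay": "blood_stage", "ic50_uM": 0.25},
    {"compound_id": "CPD-0002", "assay": "blood_stage", "ic50_uM": 0.02},
]

def structure_activity_view(compounds, assay_results):
    """Join compound records to assay results on compound_id."""
    by_id = {c["compound_id"]: c for c in compounds}
    return [
        {**by_id[r["compound_id"]], **r}
        for r in assay_results
        if r["compound_id"] in by_id
    ]

for row in structure_activity_view(compounds, assay_results):
    print(row["compound_id"], row["series"], row["assay"], row["ic50_uM"])
```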

What tools does MMV use to manage its data?

MMV’s data repository is called ScienceCloud, which, as its name suggests, is a commercially developed, cloud-based solution that stores all data generated by MMV’s research partners. A second tool, the Logistics Management Tool, was developed in-house to help manage compound logistics. Initially built for tracking compounds internally, it will be further modified to give external users access and so facilitate communication. The two databases are integrated with one another to provide MMV’s discovery and translational teams with up-to-date information on particular compounds or assays.

How do you guarantee the quality of the data you receive?

MMV’s Discovery team ensures the quality of the data it collects and stores in its database by applying basic rules – defined by the acronym QUARTZ. This internal checklist allows us to confirm that data are: 1) of the expected Quality; 2) Useful to the projects; 3) Accessible to the end-user; 4) Relevant for decision-making; 5) Traceable to their origin; and 6) standardiZed according to MMV’s guidelines. If data adhere to these six principles, we can be confident that they are of acceptable quality.
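As a purely illustrative sketch, the machine-checkable parts of such a checklist could be expressed as a simple pre-load validation function like the one below; the field names and rules are hypothetical, and criteria such as usefulness and relevance remain scientific judgement calls rather than automated tests.

```python
# Hypothetical sketch of an automated pre-load check inspired by the QUARTZ
# principles. Only machine-checkable aspects (basic quality, traceability,
# standardization) are covered; field names and rules are illustrative only.

REQUIRED_FIELDS = {"compound_id", "assay", "ic50_uM", "source_lab", "test_date"}
STANDARD_UNITS = {"uM"}

def quartz_check(record):
    """Return a list of issues; an empty list means the record can be loaded."""
    issues = []

    # Traceable: every record must say where and when it was generated.
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing required fields: {sorted(missing)}")

    # StandardiZed: results must be reported in the agreed unit.
    if record.get("ic50_unit", "uM") not in STANDARD_UNITS:
        issues.append("results must be reported in uM")

    # Quality: a simple sanity check on the reported value.
    value = record.get("ic50_uM")
    if value is not None and value < 0:
        issues.append("negative IC50 value")

    return issues

record = {"compound_id": "CPD-0001", "assay": "blood_stage", "ic50_uM": 0.25,
          "source_lab": "Partner Lab 1", "test_date": "2020-05-01"}
print(quartz_check(record))  # [] -> record passes the automated checks
```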