Annotating regions of interest in medical images, a process known as segmentation, is often one of the first steps clinical researchers take when conducting a new study involving biomedical images.
For example, to determine how the size of the brain’s hippocampus changes as patients age, a scientist first outlines each hippocampus in a series of brain scans. For many structures and image types, this is a manual process that can be very time-consuming, especially if the regions being studied are difficult to delineate.
To streamline the process, MIT researchers developed an artificial intelligence-based system that lets a researcher rapidly segment new biomedical imaging datasets by clicking, scribbling, and drawing boxes on the images. The AI model uses these interactions to predict the segmentation.
As the user tags additional images, the number of interactions they must perform decreases, eventually dropping to zero. The model can then segment each new image precisely without user intervention.
This is possible because the model architecture was specifically designed to use information from already segmented images to make new predictions.
Unlike other medical image segmentation models, this system allows the user to segment an entire dataset without repeating work for each image.
Additionally, the interactive tool does not require a pre-segmented image dataset for training, so users do not need machine learning expertise or extensive computing resources. They can use the system for a new segmentation task without retraining the model.
In the long term, this tool could accelerate studies of new treatment methods and reduce the cost of clinical trials and medical research. Doctors could also use it to improve the efficiency of clinical applications, such as radiotherapy planning.
“Many scientists may only have time to segment a few images per day for their research because manual segmentation is so time-consuming. We hope this system will enable new science by letting clinical researchers conduct studies they previously could not, because they lacked an efficient tool,” says Hallee Wong, a graduate student in electrical engineering and computer science and lead author of a paper on this new tool.
She is joined on the paper by Jose Javier Gonzalez Ortiz PhD ’24; John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering; and senior author Adrian Dalca, an assistant professor at Harvard Medical School and MGH and a researcher at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the International Conference on Computer Vision.
Streamlining segmentation
Researchers mainly use two methods to segment new sets of medical images. Using interactive segmentation, they input an image into an AI system and use an interface to mark areas of interest. The model predicts segmentation based on these interactions.
ScribblePrompt, a tool previously developed by these MIT researchers, allows users to do this, but they must repeat the process for each new image.
Another approach is to develop a task-specific AI model to automatically segment images. This approach requires the user to manually segment hundreds of images to create a dataset and then train a machine learning model. This model predicts the segmentation of a new image. But the user must start the complex machine learning-based process from scratch for each new task, and there is no way to correct the model if it makes a mistake.
The new system, MultiverSeg, combines the best of each approach. It predicts the segmentation of a new image based on user interactions, such as scribbles, but also keeps each segmented image in a context set that it refers to later.
When the user uploads a new image and marks areas of interest, the model draws on the examples in its context set to make a more accurate prediction, with less user input.
The researchers designed the model architecture to use a context set of any size, so the user does not need to provide a specific number of images. This gives MultiverSeg the flexibility to be used in a range of applications.
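The overall workflow can be pictured as a loop over the dataset in which every finished segmentation is fed back as context for the next image. The sketch below is a minimal Python illustration of that idea; all function and class names are hypothetical, and it is not the actual MultiverSeg code or API.

```python
# Hypothetical sketch of a MultiverSeg-style workflow. The function names and
# data structures are illustrative only, not the real MultiverSeg interface.
import numpy as np

def predict_segmentation(image, interactions, context_set):
    """Stand-in for the model: combines user clicks/scribbles with
    previously segmented (image, mask) examples in the context set."""
    # A real model would run a neural network here; we return a placeholder mask.
    return np.zeros(image.shape, dtype=bool)

def segment_dataset(images, get_user_interaction, is_acceptable):
    context_set = []       # grows as each image is finished; any size is allowed
    segmentations = []
    for image in images:
        interactions = []  # clicks, scribbles, and boxes supplied by the user
        mask = predict_segmentation(image, interactions, context_set)
        # The user can keep correcting the prediction until it is good enough.
        while not is_acceptable(mask):
            interactions.append(get_user_interaction(image, mask))
            mask = predict_segmentation(image, interactions, context_set)
        # Finished examples become context for later images, so later images
        # typically need fewer interactions, eventually none at all.
        context_set.append((image, mask))
        segmentations.append(mask)
    return segmentations
```

In this picture, the context set plays the role of the task-specific training data in the fully automatic approach, except that it is assembled interactively as the user works rather than prepared and used to retrain a model in advance.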
“At some point, for many tasks, you should no longer need to provide any interactions. If you have enough examples in the context set, the model can accurately predict the segmentation on its own,” says Wong.
The researchers carefully designed and trained the model on a diverse collection of biomedical imaging data to ensure it had the ability to incrementally improve its predictions based on user input.
The user does not need to retrain or customize the model for their data. To use MultiverSeg for a new task, one can upload a new medical image and start tagging it.
When the researchers compared MultiverSeg to state-of-the-art tools for in-context and interactive image segmentation, it outperformed each baseline.
Fewer clicks, better results
Unlike those tools, MultiverSeg required less user input with each new image. By the ninth new image, it needed only two clicks from the user to generate a segmentation more accurate than one from a model designed specifically for the task.
For some types of images, such as x-rays, the user may only need to manually segment one or two images before the model becomes accurate enough to make predictions on its own.
The tool’s interactivity also allows the user to make corrections to the model’s prediction, iterating until it reaches the desired level of accuracy. Compared to the researchers’ previous system, MultiverSeg achieved 90 percent accuracy with about two-thirds the number of scribbles and three-quarters the number of clicks.
“With MultiverSeg, users can always provide more interactions to refine the AI’s predictions. This further speeds up the process significantly because it is generally faster to fix something that exists than to start from scratch,” says Wong.
In the future, the researchers want to test this tool in real-world situations with clinical collaborators and improve it based on user feedback. They also want to enable MultiverSeg to segment 3D biomedical images.
This work is supported in part by Quanta Computer, Inc. and the National Institutes of Health, with material support from the Massachusetts Life Sciences Center.