How to Design a Roadmap for a Machine Learning Project
Each of the three approaches has its merits. If the new project closely resembles something that has already been modeled, in both the data and the task, trying out modeling approaches that have already been implemented can be a quick way to establish a baseline. In doing so, you may also discover new challenges that data preprocessing or modeling must accommodate.
This might lead you into #2: exploring and understanding the data. Or you might have started here. Recognizing the unique needs of a new dataset is essential. Perhaps preprocessing or annotation needs to be handled differently. Maybe there are artifacts in the data that need to be cleaned up, or the labels aren’t always correct. The earlier you understand the challenges that preprocessing and modeling will need to contend with, the fewer surprises you will face later.
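One early check for labels that aren't always correct can be sketched in a few lines of Python. This is only an illustrative sketch, not a complete data audit: the `audit_labels` function, the record format of `(example, label)` pairs, and the toy scan names are all assumptions made for the example. It counts how many examples fall in each class and flags identical inputs that carry more than one label.

```python
from collections import Counter, defaultdict


def audit_labels(records):
    """Summarize class balance and flag identical inputs with conflicting labels.

    `records` is an iterable of (example, label) pairs. Examples must be
    hashable (for instance a file name or a content hash). This format is an
    assumption for illustration.
    """
    class_counts = Counter()
    labels_by_example = defaultdict(set)

    for example, label in records:
        class_counts[label] += 1
        labels_by_example[example].add(label)

    # Keep only the examples that were given more than one distinct label.
    conflicts = {
        example: sorted(labels)
        for example, labels in labels_by_example.items()
        if len(labels) > 1
    }
    return class_counts, conflicts


# Hypothetical toy records; real data would come from your own annotation files.
records = [
    ("scan_001", "tumor"),
    ("scan_002", "normal"),
    ("scan_003", "normal"),
    ("scan_001", "normal"),  # same input annotated twice with different labels
]

class_counts, conflicts = audit_labels(records)
print("Examples per class:", dict(class_counts))
print("Conflicting labels:", conflicts)
```

A skewed class count or a non-empty conflict list does not say what to do next, but it tells you early that annotation or preprocessing may need attention before modeling starts.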
But the step that some teams miss, and the one most critical to setting a project up for success, is a literature search. Has someone else modeled a similar task on similar data? If the type of data you’re working with is common, then you might be able to apply a very strict definition of “similar.” But if you’re working with a new imaging modality, for example, or tackling a new task, you might need to relax the definition of “similar” to find relevant research.