There is rarely a single "correct" answer in a design interview. The value lies in explaining why you chose batch serving over real-time serving, or why you chose a simple linear model over an expensive transformer architecture given the latency budget.
To appreciate this book, you must first understand why the interview itself is so brutal.