We just released a report on how data curation alone can improve vision-language model (VLM) quality across 20 public benchmarks.