Robust and Scalable Recognition of Objects and Events in Video