Our contributions to the scientific community through peer-reviewed research and publications.
We present ModalFusion, a novel architecture for integrated understanding across multiple modalities, including text, images, audio, and video. Our approach combines modality-specific encoders with a shared latent space to create unified representations that preserve the unique characteristics of each modality while enabling cross-modal reasoning and generation.
Impact:
Established a new state-of-the-art for multi-modal understanding tasks, with a 23% improvement over previous methods on the MultiModal Benchmark.
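The abstract above describes the core pattern of modality-specific encoders projecting into a shared latent space. As a minimal sketch of that pattern, not the published ModalFusion architecture, the following code assumes pre-extracted per-modality features; all module names, dimensions, and the transformer fusion step are illustrative assumptions.

```python
# Sketch of a shared-latent-space multi-modal encoder (assumed structure;
# not the published ModalFusion implementation).
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Projects pre-extracted features of one modality into the shared space."""
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, latent_dim),
            nn.GELU(),
            nn.Linear(latent_dim, latent_dim),
            nn.LayerNorm(latent_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class SharedLatentFusion(nn.Module):
    """Modality-specific encoders feeding a shared cross-modal transformer."""
    def __init__(self, feature_dims: dict, latent_dim: int = 512):
        super().__init__()
        self.encoders = nn.ModuleDict(
            {name: ModalityEncoder(d, latent_dim) for name, d in feature_dims.items()}
        )
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, inputs: dict) -> torch.Tensor:
        # Encode each modality into the shared space, then fuse them as a
        # sequence of modality tokens so attention can reason across modalities.
        tokens = torch.stack(
            [self.encoders[name](x) for name, x in inputs.items()], dim=1
        )
        return self.fusion(tokens)  # (batch, num_modalities, latent_dim)

model = SharedLatentFusion({"text": 768, "image": 1024, "audio": 128})
batch = {
    "text": torch.randn(4, 768),
    "image": torch.randn(4, 1024),
    "audio": torch.randn(4, 128),
}
fused = model(batch)
print(fused.shape)  # torch.Size([4, 3, 512])
```

Keeping a separate encoder per modality is what lets each modality retain its own feature statistics while the fusion transformer operates only on the shared space.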
This paper introduces HARMONY, a hierarchical reinforcement learning framework that enables agents to learn complex behaviors through a combination of hierarchical policy decomposition, memory-augmented decision making, and ontological knowledge representation. Our approach significantly improves sample efficiency and generalization in complex environments requiring long-term planning and reasoning.
Impact:
Reduced training time by 78% while improving performance by 45% on complex long-horizon tasks compared to non-hierarchical approaches.
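As a rough illustration of the hierarchical policy decomposition described above, the sketch below shows a two-level policy in which a high-level policy commits to a subgoal for a fixed horizon and a low-level policy emits primitive actions each step. The network shapes and fixed horizon are assumptions, and HARMONY's memory and ontology components are omitted entirely.

```python
# Two-level hierarchical policy sketch (assumed structure; memory-augmented
# decision making and ontological knowledge are not shown).
import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    """Selects a discrete subgoal every `horizon` environment steps."""
    def __init__(self, obs_dim: int, num_subgoals: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, num_subgoals))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

class LowLevelPolicy(nn.Module):
    """Produces primitive actions conditioned on observation and subgoal."""
    def __init__(self, obs_dim: int, num_subgoals: int, act_dim: int):
        super().__init__()
        self.embed = nn.Embedding(num_subgoals, 32)
        self.net = nn.Sequential(nn.Linear(obs_dim + 32, 128), nn.ReLU(),
                                 nn.Linear(128, act_dim))

    def forward(self, obs, subgoal):
        x = torch.cat([obs, self.embed(subgoal)], dim=-1)
        return torch.distributions.Categorical(logits=self.net(x))

# Rollout: the high-level policy commits to a subgoal for `horizon` steps,
# so credit assignment over long tasks happens across far fewer decisions.
obs_dim, num_subgoals, act_dim, horizon = 16, 8, 4, 10
high = HighLevelPolicy(obs_dim, num_subgoals)
low = LowLevelPolicy(obs_dim, num_subgoals, act_dim)
obs = torch.randn(1, obs_dim)
for t in range(30):
    if t % horizon == 0:
        subgoal = high(obs).sample()    # new subgoal every `horizon` steps
    action = low(obs, subgoal).sample() # primitive action every step
    obs = torch.randn(1, obs_dim)       # stand-in for an environment step
```

The sample-efficiency gain comes from the high-level policy making one decision per horizon rather than one per step, which shortens the effective credit-assignment chain.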
We propose KnowledgeGPT, a novel approach for integrating structured knowledge bases with large language models. Our method enables more factually accurate responses by grounding language generation in verified knowledge sources while maintaining the flexibility and generative capabilities of large language models. We demonstrate significant improvements in factuality and reasoning on knowledge-intensive tasks.
Impact:
Reduced factual errors by 67% compared to base LLMs while maintaining or improving fluency and coherence across all tested domains.
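The abstract does not specify KnowledgeGPT's integration mechanism, but the general retrieve-then-generate pattern it builds on can be sketched as follows. The toy triple store and the `generate_fn` placeholder (standing in for any LLM call) are assumptions for illustration only.

```python
# Retrieve-then-generate grounding sketch (assumed pattern; not the paper's
# actual retrieval or integration mechanism).
from typing import Callable

# Toy structured knowledge base: subject -> list of (relation, object) facts.
KB = {
    "Marie Curie": [("field", "physics and chemistry"),
                    ("nobel_prizes", "2")],
}

def retrieve_facts(question: str, kb: dict) -> list:
    """Select facts whose subject is mentioned in the question."""
    facts = []
    for subject, triples in kb.items():
        if subject.lower() in question.lower():
            facts += [f"{subject} {rel}: {obj}" for rel, obj in triples]
    return facts

def grounded_answer(question: str, generate_fn: Callable[[str], str]) -> str:
    """Prepend verified facts so generation is conditioned on the KB."""
    facts = retrieve_facts(question, KB)
    prompt = ("Answer using only these verified facts:\n"
              + "\n".join(facts)
              + f"\n\nQuestion: {question}\nAnswer:")
    return generate_fn(prompt)

# Usage with a stub generator (swap in a real model call):
print(grounded_answer("How many Nobel prizes did Marie Curie win?",
                      generate_fn=lambda p: "[model output for]\n" + p))
```

Conditioning generation on retrieved facts, rather than modifying model weights, is what preserves the base model's fluency while constraining its factual claims.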
This paper presents ScaleML, a distributed training framework designed for trillion-parameter AI models. We introduce novel techniques for memory management, communication optimization, and pipeline parallelism that enable efficient training of extremely large models across more than a thousand accelerators. Our system achieves near-linear scaling efficiency at up to 1024 GPUs.
Impact:
Enabled training of a 1.2 trillion parameter model with 92% scaling efficiency on 1024 GPUs, reducing training time from months to days.
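Pipeline parallelism splits a model into sequential stages and feeds them micro-batches so stages can work on different micro-batches concurrently. The sketch below shows only the scheduling idea, run serially on one device; ScaleML's actual schedule, memory manager, and communication layer are not shown, and the stage sizes are arbitrary.

```python
# Conceptual pipeline-parallelism sketch with micro-batching (assumed
# GPipe-style scheduling; not ScaleML's implementation).
import torch
import torch.nn as nn

# Split one model into sequential stages, one per device in a real setup.
stages = nn.ModuleList([
    nn.Sequential(nn.Linear(512, 512), nn.ReLU()),  # stage 0 (e.g. GPU 0)
    nn.Sequential(nn.Linear(512, 512), nn.ReLU()),  # stage 1 (e.g. GPU 1)
    nn.Sequential(nn.Linear(512, 10)),              # stage 2 (e.g. GPU 2)
])

def pipeline_forward(batch: torch.Tensor, num_microbatches: int = 4):
    """Split the batch into micro-batches so stages can overlap work. This
    sketch runs serially; a real system executes stages on separate devices
    and overlaps computation with activation transfers between them."""
    outputs = []
    for micro in batch.chunk(num_microbatches):
        x = micro
        for stage in stages:
            x = stage(x)  # in practice: send/recv activations between ranks
        outputs.append(x)
    return torch.cat(outputs)

out = pipeline_forward(torch.randn(32, 512))
print(out.shape)  # torch.Size([32, 10])
```

Micro-batching is what keeps pipeline bubbles small: with more micro-batches in flight, each stage spends less time idle waiting for its neighbors.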
This paper investigates how humans form mental models of AI systems and how these models influence collaboration effectiveness. Through a series of user studies with 250 participants, we identify key factors that shape mental model formation. We then propose design principles for AI systems that foster accurate mental models, leading to more effective human-AI collaboration.
Impact:
Established new design guidelines for explainable AI that have been adopted by multiple industry partners, improving user satisfaction by 42% in collaborative AI systems.
We present a novel approach for teaching robots complex manipulation skills from human demonstrations. Our method combines imitation learning with reinforcement learning to enable robots to acquire dexterous manipulation capabilities that generalize to novel objects and situations. We demonstrate our approach on a range of challenging tasks including in-hand manipulation, tool use, and assembly operations.
Impact:
Achieved a 3.5x improvement in success rate for complex manipulation tasks compared to previous state-of-the-art methods, with successful transfer to real robotic systems.
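One common way to combine imitation learning with reinforcement learning is to mix a behavior-cloning loss with a policy-gradient term; the sketch below illustrates that idea under assumed observation and action dimensions. It is not the paper's exact objective, and the fixed-variance Gaussian policy is a simplification.

```python
# Sketch of a mixed imitation + RL objective (assumed loss structure).
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(24, 128), nn.ReLU(), nn.Linear(128, 7))
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

def training_step(demo_obs, demo_actions, rl_obs, rl_actions, advantages,
                  bc_weight: float = 0.5):
    """Mix a behavior-cloning term (match demonstrations) with a
    policy-gradient term (improve beyond them via reward)."""
    # Behavior cloning: regress demonstrated continuous actions.
    bc_loss = nn.functional.mse_loss(policy(demo_obs), demo_actions)

    # Simplified policy gradient with a fixed-variance Gaussian policy.
    dist = torch.distributions.Normal(policy(rl_obs), 0.1)
    log_prob = dist.log_prob(rl_actions).sum(-1)
    pg_loss = -(log_prob * advantages).mean()

    loss = bc_weight * bc_loss + (1 - bc_weight) * pg_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch: 24-dim proprioceptive obs, 7-dim arm action (both assumed).
step_loss = training_step(torch.randn(64, 24), torch.randn(64, 7),
                          torch.randn(64, 24), torch.randn(64, 7),
                          torch.randn(64))
```

The demonstration term anchors the policy in feasible behavior early on, while the reward term lets it generalize to objects and situations the demonstrations never covered.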
This paper introduces a novel approach to safe reinforcement learning that guarantees constraint satisfaction during both training and deployment. Our method combines formal verification techniques with reinforcement learning to create a shielding mechanism that prevents unsafe actions while minimally interfering with the learning process. We demonstrate our approach in safety-critical domains including autonomous driving and healthcare.
Impact:
Enabled zero-violation learning in safety-critical domains while achieving 94% of the performance of unconstrained RL approaches.
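The shielding idea can be illustrated with a small sketch: a safety check screens each proposed action and, when the check fails, the shield substitutes the nearest safe fallback. The toy invariant and action set here are assumptions; in the paper the check would be derived via formal verification rather than hand-written.

```python
# Minimal action-shield sketch (assumed interface; the verified safety
# check is stubbed out with a toy invariant).
def is_safe(state: float, action: float) -> bool:
    """Stand-in for a formally verified check, e.g. one proving the next
    state cannot leave the safe set."""
    return abs(state + action) <= 1.0  # toy invariant: stay inside [-1, 1]

def shielded_step(state: float, proposed_action: float, fallback_actions: list) -> float:
    """Let the learner act freely when safe; otherwise override with the
    closest safe fallback, so constraints hold in training and deployment
    while interference with learning stays minimal."""
    if is_safe(state, proposed_action):
        return proposed_action
    safe = [a for a in fallback_actions if is_safe(state, a)]
    return min(safe, key=lambda a: abs(a - proposed_action))

state = 0.9
action = shielded_step(state, proposed_action=0.5,
                       fallback_actions=[-0.5, 0.0, 0.1])
print(action)  # 0.1 -- the safe action closest to the proposed 0.5
```

Because the shield only intervenes on provably unsafe actions, the learner keeps exploring almost everywhere, which is why performance stays close to the unconstrained baseline.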
We present a novel approach for extending the capabilities of transformer-based language models to low-resource languages. Our method leverages cross-lingual transfer learning and data augmentation techniques to enable effective language understanding and generation in languages with limited training data. We evaluate our approach on 43 languages across multiple tasks.
Impact:
Extended effective NLP capabilities to 43 low-resource languages, achieving an average 31% improvement over previous methods across all evaluated tasks.
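A common recipe for cross-lingual transfer is to fine-tune a multilingual encoder on a high-resource language, then continue on scarce target-language data. The sketch below assumes `xlm-roberta-base` via Hugging Face `transformers`; the model choice, minimal training loop, and the hypothetical target-language examples are illustrative, not the paper's setup.

```python
# Cross-lingual transfer sketch (assumed recipe; not the paper's method).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def finetune(texts, labels):
    """One gradient step on a batch; the shared subword vocabulary is what
    lets task knowledge learned in one language transfer to others."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=torch.tensor(labels))
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

# Stage 1: abundant high-resource (e.g. English) data teaches the task.
finetune(["great movie", "terrible plot"], [1, 0])
# Stage 2: a handful of target-language examples adapt the model; data
# augmentation (e.g. back-translation) would expand this small set.
finetune(["yi filimu", "filimu mbi"], [1, 0])  # hypothetical low-resource text
```

The two-stage split matters: the first stage does the heavy lifting on task structure, so the low-resource stage only has to bridge the language gap rather than learn the task from scratch.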