
Janus Pro 7B

Feature Release

2025-01-27

Major update focused on unified multimodal understanding and visual generation, with display improvements for evaluation results.

Highlights

Multimodal understanding and visual generation results from Janus Pro.
Comparison of text-to-image generation between Janus Pro and its predecessor, Janus.
Architecture of Janus Pro.

Multimodal Understanding

Integrates and processes data from both visual and textual inputs.
Evaluates multimodal understanding on benchmarks including POPE, MME-Perception, GQA, and MMMU.
Scales MME-Perception scores to the range [0, 100] so they can be read alongside the other benchmarks.
Reports results across all of these metrics rather than relying on a single benchmark.
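
The MME-Perception scaling can be sketched as follows. This is a minimal illustration, not the release's actual evaluation code. It assumes that MME-Perception is reported on a 0 to 2000 scale (so scaling amounts to dividing by 20), that the other three benchmarks are already percentages, and that the scaled scores are combined by a plain average. The function names and the input values are hypothetical.

```python
from statistics import mean

# Assumption: MME-Perception raw scores lie in [0, 2000],
# so mapping them to [0, 100] is the same as dividing by 20.
MME_PERCEPTION_MAX = 2000.0


def scale_mme_perception(raw_score: float) -> float:
    """Map a raw MME-Perception score onto the [0, 100] range."""
    if not 0.0 <= raw_score <= MME_PERCEPTION_MAX:
        raise ValueError(f"raw score {raw_score} is outside [0, {MME_PERCEPTION_MAX}]")
    return raw_score * 100.0 / MME_PERCEPTION_MAX


def average_understanding_score(
    pope: float, mme_perception_raw: float, gqa: float, mmmu: float
) -> float:
    """Average four benchmark scores after putting them on a common 0-100 scale.

    Assumption: pope, gqa, and mmmu are already accuracies in percent.
    """
    return mean([pope, scale_mme_perception(mme_perception_raw), gqa, mmmu])


if __name__ == "__main__":
    # Hypothetical inputs chosen for easy arithmetic, not real benchmark results.
    print(scale_mme_perception(1500.0))
    print(average_understanding_score(80.0, 1600.0, 60.0, 40.0))
```

Without the scaling step, the MME-Perception score would dominate any average, because its raw range is twenty times wider than that of the other benchmarks.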

Visual Generation

Evaluates visual generation on two key benchmarks: GenEval and DPG-Bench.
Assesses the model's ability to follow instructions and generate relevant visual content.
Improves upon previous state-of-the-art models, including both unified multimodal models and task-specific models.
Targets high performance in instruction-based visual generation tasks.

Unified Multimodal Model Performance

Handles multimodal understanding and visual generation within a single, unified framework.
Outperforms previous state-of-the-art models across both multimodal understanding and visual generation.
Produces more accurate results on complex multimodal tasks.

User Experience Optimization

Optimizes the display of evaluation metrics and results for easier viewing.
Presents performance benchmarks and evaluation results in a form best viewed on screen.
Focuses on a visually engaging, user-friendly interface for interacting with model outputs.