Papers Read on AI

Keeping you up to date with the latest trends and best performing architectures in this fast evolvin

Episodes

Total: 205

Enabling large language models to utilize real-world tools effectively is crucial for achieving embo

GPT-4o, an all-encompassing model, represents a milestone in the development of large multi-modal la

Recent advances in latent diffusion-based generative models for portrait image animation, such as Ha

This paper introduces F5-TTS, a fully non-autoregressive text-to-speech system based on flow matchin

Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by integrating ext

Information comes in diverse modalities. Multimodal native AI models are essential to integrate real

We propose an intuitive LLM prompting framework (AgentKit) for multifunctional agents. AgentKit offe

Document understanding is a challenging task to process and comprehend large amounts of textual and

In a convergence of machine learning and biology, we reveal that diffusion models are evolutionary a

The potential effectiveness of counterspeech as a hate speech mitigation strategy is attracting incr

Large language models (LLMs) often produce errors, including factual inaccuracies, biases, and reaso

Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations. To addres

On the Diagram of Thought

2024/10/2

We introduce Diagram of Thought (DoT), a framework that models iterative reasoning in large language

The increasing demand for high-quality 3D assets across various industries necessitates efficient an

Tuning-free personalized image generation methods have achieved significant success in maintaining f

Agent-based modeling (ABM) seeks to understand the behavior of complex systems by simulating a colle

In many modern LLM applications, such as retrieval augmented generation, prompts have become program

We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method

Retrieval-Augmented Generation (RAG) leverages retrieval tools to access external databases, thereby
