Agentless

https://dsdanielpark.github.io https://github.com/dsdanielpark

MinWoo(Daniel) Park | Tech Blog

Created: 2024-08-06 07:32:42 +0000

Last modified: 2024-09-05 20:56:50 +0900

Related Project: Private
Category: Paper Review
Date: 2024-07-01

Agentless: Demystifying LLM-based Software Engineering Agents

url: https://arxiv.org/abs/2407.01489
pdf: https://arxiv.org/pdf/2407.01489
html: https://arxiv.org/html/2407.01489v1
github: https://github.com/OpenAutoCoder/Agentless
abstract: Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run commands, observe feedback from the environment, and plan for future actions. However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the following question: Do we really have to employ complex autonomous software agents? To attempt to answer this question, we build Agentless – an agentless approach to automatically solve software development problems. Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic two-phase process of localization followed by repair, without letting the LLM decide future actions or operate with complex tools. Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance (27.33%) and lowest cost ($0.34) compared with all existing open-source software agents! Furthermore, we manually classified the problems in SWE-bench Lite and found problems with exact ground truth patch or insufficient/misleading issue descriptions. As such, we construct SWE-bench Lite-S by excluding such problematic issues to perform more rigorous evaluation and comparison. Our work highlights the current overlooked potential of a simple, interpretable technique in autonomous software development. We hope Agentless will help reset the baseline, starting point, and horizon for autonomous software agents, and inspire future work along this crucial direction.

Check the GitHub repository

Contents

Agentless: Demystifying LLM-based Software Engineering Agents
- TL;DR

TL;DR

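Proposes 'Agentless', a new approach to solving complex software engineering problems
Uses LLM-based tooling to carry out development tasks on real-world issues
Achieves higher performance and lower cost than existing agent-based methods through a simplified approach

Ways to use an LLM like an agent via ScreenShotToCode, CoT, and similar techniques (feat. a divide-and-conquer agent version, evaluation methods for agents such as Copilot, etc.)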

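1. Introduction

Recent large language models (LLMs) have demonstrated the ability to generate code snippets from a user's description. However, applying LLMs to repository-level software engineering tasks has been studied comparatively less. Such tasks require understanding not only the contents of a single file but also the dependencies between files.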

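SWE-bench Benchmark

Recently, the SWE-bench benchmark was developed to evaluate how well tools can automatically resolve real-world software engineering problems. Each problem in the benchmark pairs a real GitHub issue description with the corresponding Python repository.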

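2. The Agentless Approach

Problems are solved in two phases: localization and repair

(1) The localization phase identifies the faulty files, classes, and functions
(2) The repair phase generates multiple candidate patches and selectively applies them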

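Localization and Repair Process

Agentless first takes the issue description and the existing project codebase as input. During localization, it uses a repository structure format to identify the suspicious files. It then narrows the search to the relevant classes or functions within those files and settles on the final edit locations.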
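The repository structure given to the model can be pictured as an indented text tree. The following is a minimal sketch, assuming the repository is available as a flat list of file paths. The function names build_tree and render_tree and the sample paths are hypothetical and are not taken from the Agentless code.

```python
from typing import Dict, List


def build_tree(paths: List[str]) -> Dict[str, dict]:
    """Turn a flat list of 'a/b/c.py' paths into a nested dict.

    Directories map to non-empty dicts; files map to empty dicts.
    """
    root: Dict[str, dict] = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return root


def render_tree(tree: Dict[str, dict], indent: int = 0) -> str:
    """Render the nested dict as indented text, directories before files."""
    lines = []
    # `not child` is False for directories, so they sort ahead of files.
    for name, child in sorted(tree.items(), key=lambda kv: (not kv[1], kv[0])):
        suffix = "/" if child else ""
        lines.append(" " * indent + name + suffix)
        if child:
            lines.append(render_tree(child, indent + 4))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical repository layout, for illustration only.
    sample_paths = [
        "src/app/models.py",
        "src/app/views.py",
        "src/utils.py",
        "README.md",
    ]
    print(render_tree(build_tree(sample_paths)))
```

A text tree like this keeps the prompt short, since the model sees only file names at this stage and the file contents are looked at later, once the candidate files are known.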

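In the repair phase, the code snippets at the edit locations are given to the LLM together with the issue description, and multiple candidate fixes are generated. From the generated patches, one that has no syntax errors and passes the existing tests is selected and applied.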
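The filtering and selection step can be sketched as follows. This is an illustration under stated assumptions, not the Agentless implementation: each candidate is represented as the full text of the patched Python file, passes_tests is a hypothetical callback standing in for running the existing test suite, and the final pick among surviving candidates uses a majority vote over normalized code, which is an assumption added here to break ties. It relies on ast.unparse, so it needs Python 3.9 or later.

```python
import ast
from collections import Counter
from typing import Callable, List, Optional


def is_valid_python(source: str) -> bool:
    """Return True if the source parses without a syntax error."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        return False


def normalize(source: str) -> str:
    """Round-trip through the AST so comments and spacing do not split votes."""
    return ast.unparse(ast.parse(source))


def select_patch(
    candidates: List[str],
    passes_tests: Callable[[str], bool],
) -> Optional[str]:
    """Keep candidates that parse and pass the tests, then pick the most common."""
    survivors = [c for c in candidates if is_valid_python(c) and passes_tests(c)]
    if not survivors:
        return None
    votes = Counter(normalize(c) for c in survivors)
    winner, _ = votes.most_common(1)[0]
    # Return the first original candidate whose normalized form won the vote.
    return next(c for c in survivors if normalize(c) == winner)


if __name__ == "__main__":
    # Hypothetical candidates for a tiny function, for illustration only.
    candidates = [
        "def add(a, b):\n    return a + b\n",
        "def add(a, b):\n    # sum the inputs\n    return a + b\n",
        "def add(a, b):\n    return a - b\n",   # wrong behaviour
        "def add(a, b)\n    return a + b\n",    # syntax error
    ]

    def passes_tests(source: str) -> bool:
        # Stand-in for the project's existing tests.
        namespace: dict = {}
        exec(source, namespace)
        return namespace["add"](2, 3) == 5

    print(select_patch(candidates, passes_tests))
```

The syntax check runs before the test callback, so a candidate that cannot be parsed is never executed.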

3. Experimental Setup and Evaluation

Dataset

Agentless was evaluated on the SWE-bench Lite dataset, which consists of 300 problems. Each problem requires submitting a patch based on the input issue description.

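Implementation

Agentless is implemented using GPT-4o. To solve a problem, it first identifies the top three suspicious files and then pinpoints the classes and functions within those files. It uses sampling to find the edit locations and finally selects one patch from among several candidates.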

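Evaluation

Agentless achieved the highest performance (27.33%) among open-source approaches on SWE-bench Lite, and at the lowest cost ($0.34). Compared with a range of agent-based approaches, Agentless stays highly competitive thanks to its simple but effective design.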

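4. Additional Analysis

The SWE-bench Lite dataset was broken down to classify problem types, and the solvability of each type was evaluated. The analysis covered several dimensions, including the quality of the issue description, whether the description already contains the solution, and whether it provides location information.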

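5. Related Work

The paper surveys a range of studies on LLM-based code generation and repair. Many of these studies further train the model on code snippets from a specific domain to improve its performance. In addition, benchmarks have been developed for solving software engineering problems at the level of real repositories.

Building on this research and these benchmarks, Agentless presents a simpler and easier-to-understand way to solve real-world problems.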
