feat: Configure load test automation by lsy1307 · Pull Request #31 · solid-connection/solid-connection-infra

lsy1307 · 2026-05-05T16:18:12Z

Changes

Create and tear down the load test infrastructure from GitHub Actions
Restore the load-test RDS from the latest automated snapshot of the prod RDS
Run k6 load generation on a dedicated load-generator EC2 instead of the stage EC2
Accept k6 run inputs as GitHub Actions parameters
Pass workflow inputs through env to reduce the risk of shell injection
Enforce IMDSv2 on the load-generator EC2
Remove the hardcoded Prometheus remote-write URL and enable the k6 remote-write output only when a URL is set
Pin the versions of xk6 and the Prometheus remote-write extension
Remove the legacy dump/restore and local Docker DB setup scripts
Update the load test README to match the current Start/Run/Stop flow

Current flow

Load Test Start
- Terraform creates the load-test RDS and the load-generator EC2.
- The load-test RDS is restored from the latest automated snapshot of the prod RDS.
- The datasource URL is recorded as the load-test RDS endpoint; the username/password are copied from the prod datasource Parameter Store values.
- Optionally, the stage app is restarted with the dev,loadtest profiles.
Load Test Run
- Reads the load-generator EC2 and the default run values from the Terraform outputs.
- Syncs the k6 scripts and input JSON to the EC2 via SSM RunCommand.
- Prepares a k6 build with Prometheus remote-write support using set_up_xk6.sh.
- Runs whole-user-flow.js.
- vus, iterations, max_duration, target_base_url, and prometheus_remote_write_url can be adjusted through workflow inputs.
- If prometheus_remote_write_url is empty and the Terraform default is also empty, remote-write is disabled.
Load Test Stop
- If needed, restores the stage app to the dev datasource configuration.
- If needed, removes the load-test RDS and the load-generator EC2 with Terraform destroy.
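
The remote-write rule in the Run step (enable the output only when a URL is available) could be sketched roughly as follows. This is a minimal illustration, not the contents of run_k6.sh: the function name `build_k6_args` and its argument order are assumptions, while `experimental-prometheus-rw` and `K6_PROMETHEUS_RW_SERVER_URL` are the output name and variable mentioned in the review comments.

```bash
#!/usr/bin/env bash
set -eu

# build_k6_args INPUT_URL DEFAULT_URL [SCRIPT]   (hypothetical helper)
# Prints the k6 arguments on one line. The remote-write output flag is added
# only when a URL is available: workflow input first, Terraform default second.
build_k6_args() {
  local input_url="${1:-}" default_url="${2:-}" script="${3:-whole-user-flow.js}"
  local url="${input_url:-$default_url}"

  set -- run
  if [ -n "$url" ]; then
    # The caller would also export K6_PROMETHEUS_RW_SERVER_URL="$url" for k6.
    set -- "$@" -o experimental-prometheus-rw
  fi
  set -- "$@" "$script"
  printf '%s\n' "$*"
}

# Both values empty: remote-write stays disabled.
build_k6_args "" ""
# Placeholder URL standing in for the Terraform default.
build_k6_args "" "https://prometheus.example.invalid/api/v1/write"
```

Keeping the decision in one place means the k6 invocation never receives an output flag that points at an empty URL.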

Tests

terraform fmt -check -recursive environment/load_test
terraform -chdir=environment/load_test validate
bash -n scripts/load_test/start.sh
bash -n scripts/load_test/run_k6.sh
bash -n scripts/load_test/stop.sh
bash -n config/load-test/k6/set_up_xk6.sh
node --input-type=module --check config/load-test/k6/whole-user-flow.js
git diff --check

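- Details: Reflect the secret submodule commit required to run load tests in the parent infra repository

- Details: Define the load test RDS, security group, and SSM datasource parameters in Terraform
- Details: Allow access to the loadtest RDS on port 3306 from the prod/stage EC2 security groups

- Details: Automate RDS creation, the stage switch, and the prod data copy in start.sh
- Details: Provide the stage rollback and loadtest RDS destroy flow in stop.sh
- Details: Document a bash-based procedure for Windows and macOS/Linux environments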

coderabbitai · 2026-05-05T16:18:20Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a5c5276b-758b-492c-8e38-83b8e4faa7b4


🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 19.44% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
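Linked Issues check	❓ Inconclusive	Linked issue `#19` contains only TODO items without concrete requirements, so compliance is hard to assess accurately.	Turn the TODO items in issue `#19` into a concrete checklist, or review them together with the PR.
Description check	❓ Inconclusive	The PR description includes the changes, current flow, and tests, but some required template sections are partially missing.	The related issue (resolves `#19`) is stated, but the special notes and review requirements sections are not clearly separated. Following the template format more closely is recommended.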

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title, '부하 테스트 자동화 구성' (load test automation setup), clearly summarizes the main change.
Out of Scope Changes check	✅ Passed	All changes fall within the scope of load test automation infrastructure; there are no unrelated changes.


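- Details: Add workflows so load test start and stop can be triggered manually via workflow_dispatch
- Details: Switch and roll back the stage server through SSM RunCommand instead of SSH
- Details: Document the flow that uses an OIDC-based AWS role and GH_PAT submodule checkout, with no SSH key input

- Details: Include the monitor repo's k6 files in the infra repo so the stage EC2 cloud-init places them
- Details: Add a k6 file placement option to the app_stack module and enable it only in the stage environment
- Details: Rewrite the load test README in Korean and organize the GitHub Actions execution flow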

github-actions · 2026-05-05T16:56:21Z

Terraform Plan: `stage`

No changes. Your infrastructure matches the configuration.

The full plan output is not included in the comment for security reasons. Check the workflow run artifacts.

github-actions · 2026-05-05T16:56:48Z

Terraform Plan: `prod`

No changes. Your infrastructure matches the configuration.

The full plan output is not included in the comment for security reasons. Check the workflow run artifacts.

github-actions · 2026-05-05T16:56:58Z

@coderabbitai review

Hexeong

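Great work! I have a couple of questions.

As I understand it, one GitHub Action creates the load test environment and another runs the load test. My understanding is that the GitHub runner generating the load also needs enough capacity for the vuser settings to take effect properly, so I'm wondering whether that runner's specs are sufficient.
Second, it currently looks like inputs such as updatePost.json are passed in as files when running the load test. Is there a way for a developer to enter parameters when triggering the GitHub Action? That would make load test runs more flexible.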

coderabbitai

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

environment/load_test/main.tf (1)
1-145: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

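The environment/load_test environment is missing from the automated Terraform validation pipeline

The detect-changes job in terraform-plan.yml (lines 17-57) does not define a filter for the load_test environment, and there is no corresponding plan-load_test job. As a result, this PR changes environment/load_test/*.tf, but no automated plan validation ran, so unexpected resource destruction or replacement cannot be verified.

Required action:

Add a load_test filter and a plan-load_test job to .github/workflows/terraform-plan.yml so that changes under environment/load_test/** are detected.

Per the coding guideline for **/*.tf ("always check the 'Terraform Plan' result for each environment posted in the PR comments"), the load_test plan result must be included in the PR comments.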
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@environment/load_test/main.tf` around lines 1 - 145, The detect-changes
workflow is missing the load_test environment so changes under
environment/load_test/** are not caught; update the detect-changes job (the job
named detect-changes) to include a path filter for environment/load_test/** and
add a corresponding plan-load_test job (modeled after existing plan-* jobs) that
runs the Terraform init/plan for the load_test workspace and posts plan output
to the PR; ensure the job name is plan-load_test and it references the same
steps/variables (workspace, backend config, ssm/kms variables) used by other
plan jobs so the new job is executed when files in environment/load_test/**
change and its plan gets commented on the PR.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@config/load-test/k6/set_up_xk6.sh`:
- Around line 15-25: The Prometheus remote write URL and trend-stats export are
inconsistent between the current shell and the lines being appended to the login
shell; update set_up_xk6.sh so the echoed ~/.bashrc lines use the same
K6_PROMETHEUS_RW_SERVER_URL value as the current shell (use the existing
K6_PROMETHEUS_RW_SERVER_URL variable rather than hardcoding a different IP) and
add the missing export for K6_PROMETHEUS_RW_TREND_STATS (export
K6_PROMETHEUS_RW_TREND_STATS="p(90),p(95),p(99),avg,min,max") so child processes
like k6 receive the setting.

In `@config/load-test/k6/whole-user-flow.js`:
- Around line 333-366: searchUniversities(), getLanguageTests(), and getGPAs()
may return empty or unexpected responses and the code immediately dereferences
ids (e.g., uniList[Math...].id, langList[0].id, gpaList[0].id), causing runtime
TypeError; modify the flow in whole-user-flow.js to validate the HTTP response
and parsed body before indexing: call .json() then check that the arrays
(uniList, languageTestScoreStatusResponseList, gpaScoreStatusResponseList) exist
and have length > 0, and if not call k6.fail() (or otherwise record a clear
failure) with a descriptive message including the function name and any
status/error info; update the calls around searchUniversities, getLanguageTests,
and getGPAs to perform these checks and only proceed to extract .id when
present.

In `@environment/load_test/main.tf`:
- Around line 1-10: The data sources data.aws_vpc.default and
data.aws_subnets.default must not rely on default = true; instead select the VPC
and its subnets by the same criteria as your stage/prod instances (e.g., filter
by the environment tag, or derive vpc_id from a representative
EC2/data.aws_instance used by stage/prod) so the load-test DB ends up in the
same VPC; change data "aws_vpc" "default" to a filtered lookup (remove
default=true and add filters like tag:Environment or id =
data.aws_instance.<name>.vpc_id) and update data "aws_subnets" "default" to use
values = [data.aws_vpc.selected.id]; apply the same pattern for the other
occurrences noted.
- Around line 12-35: The Terraform changes under environment/load_test are not
included in the PR auto-validate because terraform-plan.yml's detect-changes
filter omits that path; update terraform-plan.yml to include
"environment/load_test/**" in the detect-changes paths so changes to data
"aws_instance" "prod_api" and data "aws_instance" "stage_api" trigger plan runs,
or alternatively modify the load_test Terraform to avoid ambiguous name-based
lookups by accepting instance IDs as variables and replacing the tag-based data
sources with direct aws_instance lookups by ID to prevent apply-time failures
when multiple instances share the same Name tag.

In `@environment/stage/main.tf`:
- Line 45: Setting only enable_k6_files = true does not apply the cloud-init user data change to the existing stage
EC2; locate modules/app_stack/ec2.tf and the aws_instance.api_server
resource which currently has user_data_replace_on_change = false and
lifecycle.ignore_changes that includes user_data, and either (A) set
user_data_replace_on_change = true and remove user_data from
lifecycle.ignore_changes so the instance will be recreated/updated with the k6
files, or (B) keep instance untouched and add an explicit file/SSM sync step to
copy files into /home/ubuntu/solid-connection-load-test/k6 (or document that
instance recreation is required) depending on whether you want automatic
redeploy or an out-of-band deployment.

In `@scripts/load_test/README.md`:
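- Around line 33-36: The wording reads like a local run, so change the sentence about running `terraform
apply` in environment/load_test to state that GitHub Actions runs it: replace the README item (currently "1.
`environment/load_test`에서 `terraform apply`를 실행합니다.") with "GitHub Actions가
`environment/load_test`에서 `terraform apply`를 실행합니다." (that is, "GitHub Actions runs `terraform apply` in
`environment/load_test`."), and if needed add a one-line rule, 'environment/*.tf files must not be applied
locally; run them only through GitHub Actions', to make the policy (environment Terraform applies are GitHub
Actions only) explicit.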
- Around line 52-63: Update the deployment docs and automation so stage EC2
always has the k6 assets: either modify the Start workflow to run an SSM step
that syncs the repo k6 directory into /home/ubuntu/solid-connection-load-test/k6
(copy the files listed: createPost.json, updatePost.json, whole-user-flow.js,
set_up_xk6.sh, script/set-load-test.sh), or change the Actions/SSM job to
perform a repository checkout on the target and run k6 from that checked-out
path; update README.md to document which of these two approaches is implemented
and reference the cloud-init path `/home/ubuntu/solid-connection-load-test/k6`
and the setup scripts so reviewers can locate the change.
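- Line 42: The current README step (run `mysqldump` on the prod EC2 via SSM RunCommand, then restore into the
loadtest RDS) copies the entire production DB as-is, which carries a high risk of leaking personal data; instead
of keeping the `mysqldump` call as a full dump, generate the dump with a data masking/anonymization script or
table/column filtering (dump only the required subset of tables), and automate a pre-restore check that sensitive
fields (e.g., user identifiers, emails, phone numbers) have been removed; also specify the dump file's retention
period and automatic deletion (e.g., an S3 lifecycle rule or an auto-delete script on the EC2) in the README
procedure and the SSM RunCommand spec, assign a fixed owner, and keep verification logs.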

In `@scripts/load_test/start.sh`:
- Around line 228-232: The dump file DUMP_FILE can be left on /tmp if a later
command fails; after creating DUMP_FILE in the remote shell session (right after
the mysqldump command that sets DUMP_FILE), register a shell EXIT trap such as
trap 'rm -f "$DUMP_FILE"' EXIT so the temporary gzip file is removed on any exit
(success or failure); ensure the trap is set inside the same remote shell
context that creates and consumes DUMP_FILE and that the final explicit rm -f
"$DUMP_FILE" remains (the trap will be a safety net for error paths).
- Around line 6-11: The script currently hardcodes
DATABASE_NAME="solid_connection" (and similar hardcoded username/password param
names) which can drift from Terraform; update start.sh to fetch the DB name and
related parameters from Terraform outputs instead of hardcoding: call terraform
output (or read the exported load_test_db_name output) to set DATABASE_NAME and
use the corresponding Terraform outputs for LOADTEST_DB_USERNAME_PARAMETER and
LOADTEST_DB_PASSWORD_PARAMETER (and the prod equivalents) so the variables used
in the dump/restore logic (referencing DATABASE_NAME,
LOADTEST_DB_USERNAME_PARAMETER, LOADTEST_DB_PASSWORD_PARAMETER,
PROD_DB_USERNAME_PARAMETER, PROD_DB_PASSWORD_PARAMETER) always reflect the
current tf outputs.
- Around line 98-119: The SSM polling loop using status, command_id, and
instance_id has no overall timeout and can hang indefinitely; modify the loop to
enforce a maximum wait by adding either a max_attempts counter or
start_time/timeout check, incrementing attempts (or checking elapsed seconds)
each iteration, and if exceeded print the final get-command-invocation JSON for
command_id/instance_id and exit non‑zero; ensure the existing case branches
remain but replace the infinite while true with a bounded loop or a timeout
condition so Pending|InProgress|Delayed eventually aborts and returns the last
invocation result.
- Around line 142-166: The stage-switch block guarded by
SWITCH_STAGE_TO_LOADTEST currently runs before the SKIP_DATA_COPY block, causing
the stage app to restart to dev,loadtest and hit an incomplete/empty DB during
prod dump/restore; move the entire SWITCH_STAGE_TO_LOADTEST conditional (the
commands building stage_commands_json and the call to send_ssm_command that runs
docker compose up -d solid-connection-dev) to after the SKIP_DATA_COPY/data-copy
and restore logic (or alternatively ensure the stage remains down until restore
completes by issuing a docker compose down in that block and only bringing it up
after restore completion); update references to SWITCH_STAGE_TO_LOADTEST,
send_ssm_command, and the docker compose up/down commands accordingly so stage
is only started once data copy/restore finishes.

In `@scripts/load_test/stop.sh`:
- Around line 68-89: The polling loop in send_ssm_command() (the while true that
checks status for command_id and instance_id) lacks a timeout and can hang
indefinitely; add a configurable max wait (e.g., MAX_WAIT_SECONDS or
MAX_ITERATIONS) and track elapsed time or loop counts inside the loop, break and
treat as failure when exceeded, and on timeout call aws ssm
get-command-invocation for diagnostics and exit 1 with a clear message including
the timeout, command_id and instance_id.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e653fd43-eafa-4c3d-ad1e-050891eed30e

📥 Commits

Reviewing files that changed from the base of the PR and between f85e08f and 9ef254a.

📒 Files selected for processing (18)

.github/workflows/load-test-start.yml
.github/workflows/load-test-stop.yml
config/load-test/k6/createPost.json
config/load-test/k6/script/set-load-test.sh
config/load-test/k6/set_up_xk6.sh
config/load-test/k6/updatePost.json
config/load-test/k6/whole-user-flow.js
config/secrets
environment/load_test/main.tf
environment/load_test/output.tf
environment/load_test/provider.tf
environment/load_test/variables.tf
environment/stage/main.tf
modules/app_stack/ec2.tf
modules/app_stack/variables.tf
scripts/load_test/README.md
scripts/load_test/start.sh
scripts/load_test/stop.sh

- Details: Added a Terraform plan workflow for load_test.
- Details: Changed the loadtest RDS network to be created in the stage EC2's VPC.
- Details: Applied the SSM command timeout, dump cleanup, k6 file sync, and the order of switching stage after the data copy.
- Details: Fixed k6 configuration and response validation errors.
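
The SSM command timeout mentioned in this commit addresses the review finding that the polling loop could wait forever. A bounded loop might look like the sketch below. It is an illustration under assumptions: `wait_for_status` and `fake_status` are hypothetical names, and the status source is passed in as a command so the example runs without AWS access. In the real scripts that command would presumably be `aws ssm get-command-invocation` with `--query Status --output text`.

```bash
#!/usr/bin/env bash
set -eu

# wait_for_status MAX_ATTEMPTS SLEEP_SECONDS STATUS_CMD...   (hypothetical helper)
# Polls STATUS_CMD until it prints a terminal status or the attempts run out.
# Returns 0 on Success, 1 on any other terminal status, 2 on timeout.
wait_for_status() {
  local max_attempts="$1" sleep_seconds="$2" attempt=1 status=""
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$@" || true)"
    case "$status" in
      Success) return 0 ;;
      Pending | InProgress | Delayed) sleep "$sleep_seconds" ;;
      *) echo "command ended with status: ${status:-unknown}" >&2; return 1 ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "timed out after $max_attempts attempts (last status: $status)" >&2
  return 2
}

# Stub that never finishes, standing in for the real SSM status lookup.
fake_status() { echo "InProgress"; }

wait_for_status 3 0 fake_status || echo "exit code: $?"
```

Distinguishing the timeout (exit code 2) from a failed command (exit code 1) lets the caller decide whether to print the last invocation result before aborting.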

github-actions · 2026-05-06T05:34:22Z

Terraform Plan: `load_test`

Plan: 8 to add, 0 to change, 0 to destroy.

Full plan output is kept in the workflow artifact for security. Check workflow run artifact.

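- Details: Excluded the temporarily created load_test environment from the PR Terraform plan targets.
- Details: load_test apply and destroy now run only from the manual GitHub Actions workflows.

- Details: Added a dedicated k6 EC2 and security group to the load_test Terraform.
- Details: Removed the app_stack cloud-init configuration so k6 files are no longer placed on the stage EC2.
- Details: Defaults needed to run k6 are now managed as Terraform defaults and outputs rather than secrets.

- Details: Added the Load Test Run workflow so load is generated from the dedicated k6 EC2.
- Details: Separated the loadtest workflows to use a dedicated AWS_LOAD_TEST_ROLE_ARN variable.
- Details: Removed the stage k6 sync from the start script and made it print the created load generator EC2 information.

- Details: Prometheus remote-write settings are now exported consistently from environment variables.
- Details: k6 VUs, iterations, duration, and target URL can now be injected at run time.
- Details: Added validation that fails clearly when the university, language test score, or GPA response is empty.

- Details: Documented the load test flow based on the Start, Run, and Stop workflows.
- Details: Stated that no new secret values are needed and that non-sensitive values are managed as workflow inputs and defaults.
- Details: Described the structure in which load is generated from a dedicated k6 EC2 rather than the stage EC2.

- Handle HTTP failures and JSON parsing failures explicitly
- Validate empty arrays and missing ids before dereferencing

- Restore the load-test RDS from the latest automated snapshot of the prod RDS
- Remove the dump-copy inputs and script logic
- Pin the load generator type to c7i.large
- Update the README to match the current Start/Run/Stop flow

- Reference the secrets submodule commit updated for the snapshot restore approach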

lsy1307 · 2026-05-28T17:02:34Z

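Great work! I have a couple of questions.

As I understand it, one GitHub Action creates the load test environment and another runs the load test. My understanding is that the GitHub runner generating the load also needs enough capacity for the vuser settings to take effect properly, so I'm wondering whether that runner's specs are sufficient.

Second, it currently looks like inputs such as updatePost.json are passed in as files when running the load test. Is there a way for a developer to enter parameters when triggering the GitHub Action? That would make load test runs more flexible.

Load generation now runs on a separate EC2 instead of the GitHub runner. GitHub Actions only creates the load-test infrastructure and dispatches SSM commands; k6 itself runs on the c7i.large load-generator EC2. The GitHub runner's specs are therefore not a bottleneck for applying the VU settings.
VUs, iterations, max duration, target base URL, and the Prometheus remote-write URL are now accepted as Load Test Run workflow inputs. Request body files such as createPost.json and updatePost.json, however, are still synced from the repo to the load-generator EC2. Exposing the request payload as a workflow input would complicate JSON escaping and validation, so for now test scenarios and data files are kept as code-reviewable files and only run parameters are adjustable through Actions inputs. Going forward, we could gradually add frequently used JSON files and let the Actions input choose which file to run.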
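
The follow-up idea of choosing a reviewed JSON file by name, rather than accepting free-form payloads, could be sketched as below. This is not part of the PR: `resolve_payload` is a hypothetical helper, and the allowlist simply reuses the two file names and the config/load-test/k6 directory that appear in this conversation.

```bash
#!/usr/bin/env bash
set -eu

# resolve_payload NAME [DIR]   (hypothetical helper)
# Prints the path of an allowed payload file and fails for any other name,
# so arbitrary strings from a workflow input never become file paths.
resolve_payload() {
  local name="$1" dir="${2:-config/load-test/k6}"
  case "$name" in
    createPost.json | updatePost.json)
      printf '%s/%s\n' "$dir" "$name"
      ;;
    *)
      echo "unknown payload file: $name" >&2
      return 1
      ;;
  esac
}

resolve_payload "updatePost.json"
```

An allowlist keeps payloads in code review while still letting the person triggering the run pick a scenario.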

- Remove set-load-test.sh, which the current RDS snapshot-based flow no longer uses
- Remove legacy script references from the k6 sync list and the README

Hexeong

Great work! Leaving a few questions as comments.

Hexeong · 2026-05-29T06:39:55Z

+2. Wait until the SSM agent on the load-generator EC2 is online.
+3. Sync the k6 files to the load-generator EC2 via SSM RunCommand.
+4. If the k6 binary is missing, build k6 with Prometheus remote-write support using `set_up_xk6.sh`.
+5. Run `whole-user-flow.js` on the load-generator EC2.


Thanks for the readme.md, I read it carefully. One question: as I understand it, the flow launches the EC2 and then runs the k6 script. Is the load-generating EC2 also removed automatically?

Also, the terraform loadtest plan result says an RDS will be created, and I assume it is actually created at terraform apply. How does that part work?

Thanks for the questions. The load-generator EC2 is created by Terraform apply in the Start step, and the Run step only runs k6 on the existing EC2 via SSM RunCommand. It is not removed right after Run; it is removed together with the load-test RDS by Terraform destroy when the Load Test Stop workflow runs with destroy_rds=true.

The RDS creation shown in the plan works the same way: the plan lists resources to be created, and the actual creation happens at Terraform apply in the Load Test Start workflow. At that point the load-test RDS is restored from the latest automated snapshot of the prod RDS.

Then I have a follow-up: for something like the RDS, I think it may need to stay up after the load test so we can analyze problems the test uncovered.
The EC2, on the other hand, doesn't need its own state tracked or analyzed after it generates load, so I think tying the EC2 instance's lifecycle to the loadtest run GitHub Action would work well. What do you think?

If run destroys it, re-running a load test quickly means a long wait. But keeping the EC2 stopped still incurs ongoing EBS cost. Sharing the lifecycle sounds good, but in that case how should we handle needing to re-run a load test quickly?

I'll have run perform the destroy, with an input that lets you skip the destroy when you want to re-run the load test quickly.

Meeting result: we decided to pass a destroy flag when triggering the loadtest run action. If the EC2 has already been destroyed, loadtest stop will skip it without error.
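
The decision above (Run destroys the load generator unless told otherwise, and Stop tolerates an already-destroyed instance) could be sketched as follows. All names here are hypothetical, and the destroy step is passed in as a command so the example runs without Terraform or AWS access; treating an empty instance ID as "already destroyed" is also an assumption.

```bash
#!/usr/bin/env bash
set -eu

# should_destroy FLAG: anything other than an explicit "false"-like value
# destroys, matching the plan that Run destroys unless the input opts out.
should_destroy() {
  case "${1:-true}" in
    false | False | FALSE | 0 | no) return 1 ;;
    *) return 0 ;;
  esac
}

# cleanup_generator FLAG INSTANCE_ID DESTROY_CMD...   (hypothetical helper)
# An empty INSTANCE_ID is treated as "already destroyed" and skipped quietly,
# so a later Stop run does not fail after Run has cleaned up.
cleanup_generator() {
  local flag="$1" instance_id="$2"
  shift 2
  if ! should_destroy "$flag"; then
    echo "keeping load generator for a quick re-run"
  elif [ -z "$instance_id" ]; then
    echo "load generator already destroyed, nothing to do"
  else
    "$@"
    echo "destroyed load generator $instance_id"
  fi
}

# `true` stands in for the real destroy command in this sketch.
cleanup_generator false "i-0123456789abcdef0" true
cleanup_generator true "" true
cleanup_generator true "i-0123456789abcdef0" true
```

With this shape, the same helper can serve both the Run workflow (flag from the input) and the Stop workflow (instance ID possibly empty).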

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

environment/load_test/variables.tf (1)
126-130: 💤 Low value

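The Prometheus remote-write default is plaintext HTTP with a hardcoded IP

Metrics are sent in plaintext to http://132.145.83.182:9090/... and the IP is fixed in code. This is not a functional problem, but for flexibility when the endpoint changes and for transport security, consider moving it to a variable/secret or applying TLS.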
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@environment/load_test/variables.tf` around lines 126 - 130, The terraform
variable k6_prometheus_remote_write_url currently hardcodes a plaintext HTTP IP;
update it to avoid a hardcoded plaintext endpoint by removing the fixed default
and/or switching the default to a secure placeholder (e.g., empty string or an
HTTPS URL), and document that the real value should be provided via
environment/terraform variable or a secrets manager; specifically edit the
variable "k6_prometheus_remote_write_url" to not embed the IP, prefer
https://... if known, and mark consumption points to validate non-empty/secure
scheme at runtime.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/load-test-run.yml:
- Around line 68-84: The workflow is vulnerable to shell injection by
interpolating inputs directly into the run block; stop expanding `${{ inputs.*
}}` inside the bash script arguments and instead pass all user inputs via the
job/env mapping (e.g., set environment variables like VUS, ITERATIONS,
MAX_DURATION, TARGET_BASE_URL, PROMETHEUS_REMOTE_WRITE_URL) then update the run
step that builds the args array and calls bash scripts/load_test/run_k6.sh to
reference those env vars (e.g., use "$VUS", "$ITERATIONS", etc. within the args
array and only append optional flags when the corresponding env var is
non-empty) so no template values are injected into raw shell code.

In `@config/load-test/k6/set_up_xk6.sh`:
- Around line 42-48: The script uses unpinned module versions which can break on
older Go (e.g., Go 1.22.2); update the install/build commands to pin compatible
releases instead of using `@latest`: replace go install
go.k6.io/xk6/cmd/xk6@latest and the "$XK6_BIN" build --with
github.com/grafana/xk6-output-prometheus-remote@latest with specific, tested
version strings (for example a known xk6 release compatible with Go 1.22.2 and a
matching xk6-output-prometheus-remote tag), verify compatibility locally, and
ensure the echo/--help check using the XK6_BIN variable still runs after
pinning.

In `@environment/load_test/main.tf`:
- Around line 123-151: The aws_instance "load_generator" allows IMDSv1 by
default; add a metadata_options block to the resource to force IMDSv2 by setting
http_tokens = "required" (and optionally http_endpoint = "enabled" and
http_put_response_hop_limit = 1) within the aws_instance "load_generator"
resource so the EC2 instance profile credentials exposed in user_data/SSM
contexts cannot be retrieved via IMDSv1.

In `@environment/load_test/output.tf`:
- Around line 51-59: The outputs "load_test_db_username_parameter_name" and
"load_test_db_password_parameter_name" appear to be deprecated, with no consumers in scripts/ or .github/;
if they are unused, remove both output blocks and also clean up the related variables
var.load_test_db_username_parameter_name and
var.load_test_db_password_parameter_name; if the outputs are still needed in this environment, avoid
exposing null directly by replacing the output values with production variables such as
var.prod_db_username_parameter_name / var.prod_db_password_parameter_name so that null is never returned.

In `@scripts/load_test/run_k6.sh`:
- Around line 195-212: The k6 command in run_commands_json currently injects
K6_PROMETHEUS_RW_SERVER_URL but doesn't enable the experimental-prometheus-rw
output, so remote-write won't run; change the command construction to
conditionally append the output flag when prometheus_url is non-empty (e.g.,
detect PROMETHEUS_REMOTE_WRITE_URL / the $prometheus_url jq arg and, if not
empty, add "-o experimental-prometheus-rw" (or set
K6_OUT=experimental-prometheus-rw) to the sudo -u ubuntu -H ... ./k6 run
invocation that uses K6_SCRIPT and k6_dir), ensuring the flag is only present
when prometheus_url is provided.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a3714734-9b68-4a5c-baeb-7f2f3377cd51

📥 Commits

Reviewing files that changed from the base of the PR and between 9ef254a and ab1135c.

📒 Files selected for processing (16)

.github/workflows/load-test-run.yml
.github/workflows/load-test-start.yml
.github/workflows/load-test-stop.yml
config/load-test/k6/set_up_xk6.sh
config/load-test/k6/whole-user-flow.js
config/secrets
environment/load_test/main.tf
environment/load_test/output.tf
environment/load_test/variables.tf
environment/stage/main.tf
modules/app_stack/ec2.tf
modules/app_stack/variables.tf
scripts/load_test/README.md
scripts/load_test/run_k6.sh
scripts/load_test/start.sh
scripts/load_test/stop.sh

💤 Files with no reviewable changes (3)

modules/app_stack/variables.tf
modules/app_stack/ec2.tf
environment/stage/main.tf

✅ Files skipped from review due to trivial changes (2)

config/secrets
scripts/load_test/README.md

🚧 Files skipped from review as they are similar to previous changes (3)

.github/workflows/load-test-stop.yml
scripts/load_test/stop.sh
config/load-test/k6/whole-user-flow.js

- Pass workflow inputs through env to reduce the risk of shell injection
- Enforce IMDSv2 on the load-generator EC2
- Add the k6 remote-write output and pin the xk6 version
- Remove the deprecated outputs and the hardcoded Prometheus URL
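
The env-based fix for the shell injection finding can be illustrated with a short sketch. It assumes the workflow maps inputs to environment variables (for example `VUS` from the `vus` input) instead of expanding them inside the script text. The function name `build_run_args` and the flag names `--vus` and `--target-base-url` are hypothetical, since the actual run_k6.sh options are not shown in this conversation.

```bash
#!/usr/bin/env bash
set -eu

# build_run_args   (hypothetical helper)
# Assembles the argument list from environment variables. Each value stays a
# single argument, so shell metacharacters in an input are never executed.
build_run_args() {
  set -- scripts/load_test/run_k6.sh
  if [ -n "${VUS:-}" ]; then
    case "$VUS" in
      *[!0-9]*)
        echo "VUS must contain digits only: $VUS" >&2
        return 1
        ;;
    esac
    set -- "$@" --vus "$VUS"
  fi
  if [ -n "${TARGET_BASE_URL:-}" ]; then
    set -- "$@" --target-base-url "$TARGET_BASE_URL"
  fi
  # One argument per line, to make the argument boundaries visible.
  printf '%s\n' "$@"
}

# A hostile-looking value remains one inert argument.
(
  VUS=10
  TARGET_BASE_URL='https://stage.example.invalid; rm -rf /'
  build_run_args
)
```

Optional flags are appended only when the corresponding variable is non-empty, which mirrors the reviewer's suggestion for handling unset inputs.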

lsy1307 added 3 commits May 6, 2026 01:07

chore: Update load test secret pointer

a253665

feat: Configure load test RDS infrastructure

d7456ec

lsy1307 requested review from Gyuhyeok99, Hexeong, JAEHEE25, sukangpunch, whqtker and wibaek as code owners May 5, 2026 16:18

lsy1307 added 2 commits May 6, 2026 01:47

Hexeong reviewed May 6, 2026

coderabbitai Bot reviewed May 6, 2026

lsy1307 added 5 commits May 6, 2026 17:04

chore: Exclude load test plan from automated validation

4878107

Hexeong assigned lsy1307 May 28, 2026

Hexeong added the 인프라 label May 28, 2026

lsy1307 added 3 commits May 29, 2026 01:51

feat: Strengthen k6 response validation

628043a

feat: Update load test secrets reference

f6e3174

lsy1307 force-pushed the 19-feat-loadtest-rds-parameter-store branch from 7e25661 to f6e3174 on May 28, 2026 16:51

feat: Remove legacy load test scripts

ab1135c

Hexeong reviewed May 29, 2026

coderabbitai Bot reviewed May 29, 2026

Comment thread .github/workflows/load-test-run.yml

Comment thread config/load-test/k6/set_up_xk6.sh Outdated

Comment thread environment/load_test/main.tf

Comment thread environment/load_test/output.tf Outdated

Comment thread scripts/load_test/run_k6.sh

lsy1307 added 2 commits May 29, 2026 16:09

feat: Add option to clean up the load generator after a run

23a3e57
