Launch a Debugging Terminal into GitHub Actions

(blog.gripdev.xyz)

155 points | by martinpeck 26 days ago

14 comments

embedding-shape 26 days ago
That the entire ecosystem seems to have moved to GitHub Actions is such a loss for productivity. I remember when CircleCI first launched, and you could "Rebuild with SSH", which gave you a bash command to connect to the running instance whenever you wanted. That was such a no-brainer, and I'm sure it's why many of us ended up using CircleCI for years. Eventually CircleCI became too expensive, but I still thought that if other services learnt anything from CircleCI, it would be this single feature, because of the number of hours it saved thousands of developers.
Lo and behold, when GitHub Actions first launched, that feature was nowhere to be seen, and I knew from that moment on that betting on GitHub Actions would be a mistake if they didn't launch with such a table-stakes feature. It seems Microsoft still hasn't gotten their thumb out and is wasting countless developers' time with this. Sad state of affairs.
Thank you pbiggar for the time we got with CircleCI :) Here's to hoping we'll have CircleCI.V2 appearing at some point in the future, I just know it involves DAGs and "Rebuild with SSH" somehow :)
[-]
- olafmol 13 days ago
  We (CircleCI) are still there, and doing just fine :) Out of interest, what are you currently missing and what would those "essential" V2 features be? tnx for sharing your thoughts!
- melezhik 21 days ago
  You can use http://deadsimpleci.sparrowhub.io. It lets you debug CI locally, as under the hood it is just Docker and your scripts (Python, bash, whatever), no magic. The project is in active development and I am open to feedback.
- kevmo314 26 days ago
  I am surprised Docker didn't launch into the CI market. Running a container build as CI seems like it would both be a boon for simplifying CI caching and also debugging since it's ~reproducible locally.
  [-]
  - hobofan 26 days ago
    They _are_ in the CI market. Two of their products are the Docker Build Cloud and Testcontainers Cloud. IIRC Docker Hub also came with automated builds at some point (not sure if it still does).
    I do get your sentiment though. For the position they are in, a CircleCI-like product would seem to be quite fitting.
    [-]
    - kevmo314 26 days ago
      Wow you're right they are. Yeah, they could really use some improvement there.
      https://docs.docker.com/build-cloud/ci/
      This could've been a "change runs-on to be this" like all the other faster GHA startup products, but instead the way they set it up I would have to keep paying for GHA while also paying for their build cloud. No fun!
- fyhn 26 days ago
  I've gotten used to this essential feature too via Semaphore CI, and I just can't stand not being able to SSH into a GitHub Action. Debugging is so slow.
  [-]
  - embedding-shape 26 days ago
    I've seen people spend something like 2 hours fixing something that could be fixed in minutes with a normal feedback cycle instead of the 5 minute "change > commit > push > wait > see results" feedback cycle GitHub Actions forces people into. It's baffling until you realize Microsoft charges per usage, so why fix it? I guess the baffling part is how developers put up with it anyway.
    [-]
    - Storment33 26 days ago
      Does not sound like a GitHub failure, sounds like it is the company's failure. They haven't invested in the developer experience and they have developers who cannot run stuff locally and are having to push to CI in order to get feedback.
      [-]
      - IshKebab 26 days ago
        You can't run a GitHub CI pipeline locally (in general; there are some projects to try but they're limited). Even if you make as much of it runnable locally as possible (which you should) you're inevitably going to end up debugging some stuff by making commits and pushing them. Release automation. Test reporting. Artifact upload. Pipeline triggers. Permissions.
        Count yourself lucky you've never had to deal with any of that!
        [-]
        Storment33 26 days ago
        Yes, there are a few things you can't do locally. But the vast majority of complaints I see (90%+) are for builds/tests etc. that should have the same local feedback loops. CI shouldn't be anything special, it should be a 'shell as a service' with some privileged credentials for pushing artefacts.
        > Release automation. Test reporting. Artifact upload.
        Those I can actually all do locally for my open source projects on GitHub, if I have the correct credentials in my env. It is all automated (which I developed/tested locally) but I can break glass if needed.
        [-]
        IshKebab 25 days ago
        > Those I can actually all do locally for my open source projects on GitHub
        Maybe I wasn't clear enough in my description, but you definitely can't locally do things like automatically creating a release in a Github workflow, sending test results as a comment to PRs automatically and uploading CI pipeline artifacts locally. Those all intrinsically require running in Github CI.
        [-]
        Storment33 25 days ago
        I agree there is stuff you can't test locally, but in my experience people most of the time are complaining about stuff they should have local feedback loops for such as compiling, testing, end to end testing etc.
        You give some good examples and I agree there is CI-specific stuff that can only really be tested on CI, but it is a subset of what I generally see people complaining about.
        > can't locally do things like automatically creating a release in a Github workflow, sending test results as a comment to PRs automatically and uploading CI pipeline artifacts locally.
        > uploading CI pipeline artifacts locally
        I actually tested this locally before opening a pull request to add it. I just have my workflow call out to a make target, so I can do the same locally if I have the right credentials using the same make target.
        E.g. this workflow triggers on a release.
```yaml
name: Continuous Delivery (CD)

on:
  release:
    types: [published]

# https://docs.github.com/en/actions/using-jobs/assigning-perm...
permissions:
  contents: write
  packages: write

jobs:
  publish-binary:
    name: Publish Binary
    runs-on: ${{ matrix.architecture }}
    strategy:
      matrix:
        architecture: [ubuntu-24.04, ubuntu-24.04-arm]
    steps:
      - name: Checkout code.
        uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
      - name: Setup Nix.
        uses: cachix/install-nix-action@4e002c8ec80594ecd40e759629461e26c8abed15 # v31.9.0
      - name: Publish binary.
        run: nix develop -c make publish-binary RELEASE="${GITHUB_REF_NAME}"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} # This token is provided by GitHub Actions.
```
Which after building the binary calls this script
```bash
#!/usr/bin/env sh

set -o errexit
set -o xtrace

if [ "$#" -ne 2 ]; then
    echo "Usage: $0 RELEASE_TAG TARGET"
    echo "$#"
    exit 1
fi

RELEASE="$1"
TARGET="$2"

tar -czvf "${TARGET}.tar.gz" -C "target/${TARGET}/release" "clean_git_history"
gh release upload "${RELEASE}" "${TARGET}.tar.gz"
rm "${TARGET}.tar.gz"
```
        So I was able to test large parts of this locally first via `make publish-binary RELEASE="test-release"`.
      - embedding-shape 26 days ago
        Can't do much about that when what you're troubleshooting is the CI platform itself. Say you're troubleshooting why the deployment doesn't work, and you somehow got an environment variable wrong for whatever reason. So you edit and add an "env | sort" before that step, commit it, push it, and so on. With "Rebuild with SSH", you are literally inside the "job" as it runs.
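        A minimal sketch of that "env | sort" debugging step as a reusable shell function. The function name `dump_env` and the redaction pattern (TOKEN, SECRET, PASSWORD, KEY) are assumptions for illustration, not anything GitHub Actions provides:

```shell
#!/usr/bin/env sh
# Hypothetical debug helper: print the environment sorted by name,
# with the values of secret-looking variables replaced by ***.
# The name pattern below is an assumption; adjust it to your own variables.
dump_env() {
    env | sort | sed -E 's/^([^=]*(TOKEN|SECRET|PASSWORD|KEY)[^=]*)=.*/\1=***/'
}
```

        Calling `dump_env` right before the failing step would show what the job actually sees, but it still costs a commit and a push per attempt, which is the point being made here.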
        [-]
        Storment33 26 days ago
        Yes, you can't really debug CI-specific stuff locally, like if you're setting up build caching or something. But it seems like 90%+ of the time people are complaining about builds/tests that should have local feedback loops.
        [-]
        embedding-shape 26 days ago
        Yeah, fair point, I see that a lot in the wild too. I guess I kind of assumed we all here had internalized the practice of isolating everything into one command that runs remotely, like "make test" or whatever, rather than what some people do and put entire shellscripts-but-yaml in their pipeline configs.
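        A rough sketch of the "one command" practice, assuming a hypothetical `ci.sh` at the repository root. The task names and the make targets behind them are placeholders, not taken from any project in this thread:

```shell
#!/usr/bin/env sh
# Hypothetical ci.sh: one entry point that the workflow and a laptop both call.
# The task names and make targets are placeholders.

# Print the command for a task, so the mapping lives in one place.
task_command() {
    case "$1" in
        build) echo "make build" ;;
        test)  echo "make test" ;;
        *)     return 1 ;;
    esac
}

# Run a task by name; unknown names fail loudly instead of silently passing.
run_task() {
    cmd=$(task_command "$1") || {
        echo "unknown task: $1" >&2
        return 2
    }
    echo "+ $cmd"
    $cmd
}

# Only dispatch when a task name was given on the command line.
if [ "$#" -gt 0 ]; then
    run_task "$1"
fi
```

        The pipeline config then shrinks to a single step that runs `./ci.sh test`, and the same command works locally.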
        [-]
        Storment33 26 days ago
        Yeah, every time I see logic in YAML I cringe. Trying at work to get people to use a task runner or even call out to scripts was a fight...
- ljm 26 days ago
  Still using CircleCI. I do not love YAML at all, in fact I hate it because it's basically a 1980s text preprocessor on steroids and with dependency management. Too much logic applied to config that depends on implicit syntax and unintuitive significant whitespace.
  I mean, I had an issue once where this broke the pipeline:
```
   key:
     - value 1
     - value 2
```
  But this was fine:
```
    key:
    - value 1
    - value 2
```
  Fuck that noise!
  Otherwise it works just as good as it ever did and I don't miss Github Actions where every pipeline step is packaged into a dependency. I think Github has stagnated harder than CircleCI.
  [-]
  - woodruffw 26 days ago
    > I mean, I had an issue once where this broke the pipeline:
    It seems fair to dislike YAML (I dislike it too), but I don't understand how this broke for you unless CircleCI (or whoever) isn't actually using a legal YAML parser.
```
    irb(main):009:0> YAML.load <<EOD
    irb(main):010:0" key:
    irb(main):011:0"  - value 1
    irb(main):012:0"  - value 2
    irb(main):013:0" EOD
    => {"key"=>["value 1", "value 2"]}
    irb(main):014:0> YAML.load <<EOD
    irb(main):015:0" key:
    irb(main):016:0" - value 1
    irb(main):017:0" - value 2
    irb(main):018:0" EOD
    => {"key"=>["value 1", "value 2"]}
```
    (This works for any number of leading spaces, so long as the spacing is consistent.)
  - jborean93 26 days ago
    There shouldn't be any difference between those two values. I'm not saying you are wrong and it didn't break but it's definitely surprising a parser would choke on that vs YAML itself being the problem.
    Don't get me wrong, I can empathise with whitespace formatting being annoying, and having both forms be valid just adds confusion. It's just surprising to see this was the problem.
stabbles 26 days ago
I'm using tmate for this: https://github.com/mxschmitt/action-tmate
[-]
- efrecon 26 days ago
  I have written https://github.com/efrecon/sshd-cloudflared to solve the same problem. It provides you with an SSH connection inside a transient cloudflare tunnel. The connection is only accessible to the SSH public keys stored in your GitHub account.
- Etheryte 26 days ago
  This is the only reasonable way to ever do this, requires no effort, just copy paste one of the examples and you're done. My only gripe is that the most secure option isn't the first example in the repo. Limit access to the actor and put it behind the debug only flag and you're good to go. Still, I remove it after the fact once I don't need it anymore since it feels a bit too sketch with secrets available.
- gorjusborg 26 days ago
  I'll second this.
  I've used this action to debug builds, and it works beautifully.
  However, I've had to stop because of corporate policy: the action isn't a 'verified' action.
  I'd love to see github themselves offer something like this.
  [-]
  - SamuelAdams 26 days ago
    The neat part is you can do whatever you want in a GitHub action, corporate policy be damned. So:
    git clone <tmate / banned action git URL>
    cd <the action>
    Run the action start point.
    Apparently this is a feature, not a security risk.
    https://blog.yossarian.net/2025/06/11/github-actions-policie...
- theK 26 days ago
  tmate.io returns a 503. Hugged to death by your comment?
cyberax 26 days ago
I solved it by adding a simple Tailscale action to handle failure. It creates an ephemeral instance and waits for connections for 3 minutes. Then it loops while there's an active SSH session present.
It's that simple: https://gist.github.com/Cyberax/9edbde51380bf7e1b298245464a2... and it saved me _hours_ of debug time.
I've moved all my CI/CD to use Taskfiles inside a Docker container since then, so my local environment can replicate the CI/CD environment up to the GITHUB_TOKEN. Still, being able to poke around Github builders is great.
[-]
- lioeters 26 days ago
  That looks like a useful trick, using an ephemeral instance to SSH into a failed CI action context. I see in the script how it waits and checks for root user login, but to keep it alive, this part:
  > Then it loops while there's an active SSH session present.
  From what I can see, the loop stops when a user is logged in. Is this handled elsewhere?
  > use Taskfiles inside a Docker container since then, so my local environment can replicate the CI/CD environment
  Oh this is what I've been wanting, a vendor-neutral way to run the same CI actions locally. I'd seen go-task before, will try it, thanks for the info!
  [-]
  - cyberax 26 days ago
    > That looks like a useful trick, using an ephemeral instance to SSH into a failed CI action context.
    Yup. And Tailscale even manages the SSH key provisioning.
    > From what I can see, the loop stops when a user is logged in. Is this handled elsewhere?
    The script does handle it. The `pgrep` succeeds (returns zero exit code) if there's a "login" process for user 'root' present, which is created when there's an active SSH session. If pgrep fails, then `break` runs and exits the loop.
    Github then terminates the workflow and releases the runner.
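    A minimal sketch of that wait-then-hold logic, written from the description above rather than copied from the gist. The function names, the `SESSION_CHECK` override, and the exact `pgrep` arguments are assumptions:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the wait-then-hold loop; not the gist's actual code.

# Succeeds while a "login" process owned by root exists.
# SESSION_CHECK is an assumed override so the check can be stubbed out.
session_active() {
    ${SESSION_CHECK:-pgrep -u root login} >/dev/null 2>&1
}

# wait_for_debugger GRACE_SECONDS POLL_SECONDS
wait_for_debugger() {
    grace="$1"
    poll="$2"
    waited=0
    # Phase 1: give someone up to GRACE_SECONDS to connect.
    while [ "$waited" -lt "$grace" ]; do
        if session_active; then
            break
        fi
        sleep "$poll"
        waited=$((waited + poll))
    done
    # Phase 2: hold the runner for as long as the session lasts.
    while session_active; do
        sleep "$poll"
    done
    echo "no active session; releasing runner"
}
```

    With the 3-minute window mentioned earlier this would be called as `wait_for_debugger 180 5`; once it returns, the step ends and the runner is released.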
    [-]
    - lioeters 26 days ago
      Ah I see what you mean, the loop keeps it alive until login is detected, and after that the machine is kept alive by the SSH session itself. Appreciated.
- rurban 24 days ago
  You also got the Tesla keys, nice!
dreslan 26 days ago
I love this use of hole punching, also love how the author handled authentication.
I have definitely been in the position of needing to tweak a workflow over and over to get it to work, wasting hours when a terminal into the action would have allowed me to close the loop in minutes. Nice work to the author!
lawrencegripper 26 days ago
Author here, this was something I wrote for fun/because I wanted to use it. Happy to answer any questions
[-]
- Imustaskforhelp 26 days ago
  This is really awesome and I might try it (definitely bookmarked)
  This might seem off-topic, but you mention Railway and how for a 20 MB app the costs become almost negligible, and I got curious because I usually consider Hetzner to be one of the cheapest but still good and worthwhile options.
  I find Railway's pricing model the most interesting. Do you know of any alternatives to Railway that follow a similar pricing model? I'd like to compare them, preferably services that are closer to bare metal than the usual cloud providers, if that makes sense.
  [-]
  - lawrencegripper 26 days ago
    Thanks! I'm not aware of others offering this pricing model
t_tsonev 26 days ago
Why SSH to the build agent when you can run your actions locally using the excellent https://github.com/nektos/act
[-]
- apwheele 25 days ago
  I only have pretty tame actions workflows and I have had a hard time replicating simple set ups with this. I can't imagine a company with more complicated setups.
  What I wish is github codespaces could just do this out of the box, at least for a specific action/runner.
- hole_in_foot 26 days ago
  [dead]
whynotmaybe 26 days ago
That's my hill to die on : you must have a self hosted agent.
You can have as many cloud agents as you wish, but you must have at least one that you can remotely connect to.
It has saved me hours of troubleshooting and polluting "workflow v1.3.56_final_should_work_2" commits
[-]
- maxloh 26 days ago
  > That's my hill to die on : you must have a self hosted agent.
  That’s only true if you’re building simple workflows.
  A counter-example would be a workflow that builds and uploads Android APKs. When I last checked last year, there weren't any well-maintained Docker images with the Android SDK pre-installed, and there are no updated, publicly available builds for the runner-images: https://github.com/actions/runner-images/issues/176
  [-]
  - whynotmaybe 26 days ago
    I'm building and deploying appbundle from my self hosted runner for this exact reason.
    I manually maintain flutter and Android sdk on my server.
    I've never been a docker fan, I prefer to completely handle my whole stack.
    I have scripts to install the required tools and some actions in my scripts are just echoing what needs to be done manually.
    Over the years, I've found that infra for fully reproducible builds costs too much for us to maintain.
  - esafak 26 days ago
    I do not follow. How does that change anything? Don't things still go wrong? Do you not need to debug?
    [-]
    - maxloh 26 days ago
      Sorry for not being clear enough.
      The point is that it is very difficult to replicate the environment of a hosted GitHub Actions runner, and having to do so defeats the ease of use the platform provides.
- nwellinghoff 26 days ago
  Agreed. So much easier with self hosted runner. Just get out of your own way and do it. Use cases like caching etc also much more efficient on self hosted runner.
- flanked-evergl 26 days ago
  This kind of misses the point, though. I would say a much better rule is whatever runs in your workflows should also be entirely reproducible locally.
  Even if you can SSH into the remote environment, that does not cover things like authentication and authorization; you don't just get a GITHUB_TOKEN with the same permissions.
  [-]
  - Storment33 26 days ago
    Exactly, you should be able to do everything locally! All this needing to SSH into runners or needing self-hosted runners or needing act to emulate GitHub Actions is really a failure of the developer experience.
    [-]
    - whynotmaybe 26 days ago
      A lot of stuff can be handled by developers themselves, but usually some steps are deliberately blocked, like publishing to Google Play/App Store.
      You don't want anyone to be able to publish public facing app from their version of the code that might not be committed.
      Some of us remember an era where deployment was copy-paste from the local /bin folder to the /bin folder on production server.
      [-]
      - Storment33 26 days ago
        While I get some stuff you can't test locally, like 90%+ of complaints I see are for builds/tests. Which is really a failure of the engineers for not having a local feedback loop.
        I am of the opinion you should be able to deploy from your machine, just you do not have the permissions to normally. So that if CI ever goes down and you need to push an emergency fix or something you can break glass if needed.
        [-]
        array_key_first 26 days ago
        If you cannot build and run the application locally, I think there is something seriously, seriously wrong at the company. 90% of my day involves sitting in PHP storm with a debugger attached, introspecting whatever I need to. If I had to rely on even print statements being shit out on someone else's machine I don't know that I could be productive.
        [-]
        Storment33 25 days ago
        I agree, yet unfortunately most of the complaints I personally see are about builds or tests: people unable to reproduce failures locally, or unable to run end-to-end tests and having to push to CI to get them run.
axm__ 26 days ago
I was looking at frp for this. Setup is a bit more involved but you don't need a browser terminal: https://github.com/rgl/frp-github-actions-reverse-shell
baby_souffle 26 days ago
There are many tools and techniques like this. Not a knock against this tool, just an observation that we seemingly need these tools.
Is there no better way, GitHub?
[-]
- embedding-shape 26 days ago
  > Is there no better way, GitHub?
  CircleCI solved this anno 2011, with "Rebuild with SSH". Microsoft asleep at the wheel as usual, not sure it's unexpected at this point.
  [-]
  - bathtub365 26 days ago
    The more you have to rerun your actions to debug them, the more money Microsoft makes. They aren’t incentivized to save you time.
    [-]
    - embedding-shape 26 days ago
      Completely bonkers that people, companies and organizations just swallow this, bait and all.
      [-]
      - esafak 26 days ago
        Free hosting, CI minutes, and an ecosystem.
        [-]
        embedding-shape 26 days ago
        Commit and push to test small incremental changes, self-hosted runners' time still counts towards CI minutes, and an ecosystem hellbent on presenting security holes as new features. I'm a bit unimpressed :)
- esafak 26 days ago
  Dagger. Workflows that run anywhere, including locally.
  [-]
  - Storment33 26 days ago
    I've seen Dagger pipelines; they're horrendous. Just have GitHub Actions call out to a task runner like Make/Taskfile etc. and use an environment manager like Mise or Nix to install all the tools.
    [-]
    - esafak 26 days ago
      I think that is a good pattern too, though I would replace the make/taskfile step with something bazel-like.
      Dagger used to be more declarative with CUE, but demand was not strong enough.
franktankbank 26 days ago
When I see stuff like this, I think wow that is cool. But then I think about doing it myself and I get nervous about security ramifications. I don't know enough myself to know if author knows the right way ya know??
theknarf 26 days ago
I remember when https://sshx.io/ first launched for this use case
x0rg 26 days ago
Wow that's great, I'm definitely going to try it. This guy knows what he is doing.
stets 26 days ago
I want this for Gitlab so badly
[-]
- Mogzol 26 days ago
  GitLab already has "interactive web terminals" which is basically the same thing: https://docs.gitlab.com/ci/interactive_web_terminal/
msie 26 days ago
I gave GH Actions a chance when our org moved from Bamboo but I still hate it. I think I have to do more to get a build going.