Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you, we'd probably have an intern do a quick-and-dirty copy, which I rarely advocate for if I can offload to SaaS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no, I would not want to pay for this.
thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations.
ALSO - something I think about a lot - if all/most of the HumanLayer SaaS backend was open source, would that change your thinking?
My gut feeling is that with where we're headed, we'll clear that 200 pretty quickly in production cases, so we'd be interested in a bit higher volume. Our dev efforts alone would probably clear that 200/mo. If the flow/backend was open source, that'd be a total game changer for us, as I see it as an integral part of our product.
edit: I want to add here that while YC companies like yourself may have VC backing, a lot of us don't, and do consider a $500+/mo base price on a service that is operations-limited to be a lot. You need to decide who your target audience is; I may not be in that audience for your SaaS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software-as-a-commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-4o mini.
that makes sense - and I've wondered a lot, even more generally, about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently: "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra"
I think in building this, some of the things folks decided they don't want to deal with are the state machine for escalations/routing/timeouts, the infrastructure to catch inbound emails and turn them into webhooks, and stitching a single agent's context window across multiple open Slack threads - but you're right, all of that can be solved by a software engineer with enough time and interest in the problem.
I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback, thank you!) - it's basically $20/200 credits, and you can pay-as-you-go and re-up for more whenever you want. We are early, and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you genuinely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev
I’m just armchair quarterbacking here but I feel like you should just do all features for every user with a single $/action rate, then give discounts for volume and/or prepayment. Even saying $20/200 is a clunky statement. You could just say $0.10 per action (the fact that you’re actually requiring me to make a $20 payment with a $20 charge once it gets to $10 or something like that isn’t even important to me on a pricing page, although when you mention it later in the billing page make sure you also tell people it’s a risk-free money back guarantee if that’s the case)
If there’s something that truly has an incremental cost to you, like providing priority support, that goes into the “enterprise pricing” section and you need to figure out how to quote that separately from the service. My guess is most people don’t want to pay extra for that, or perhaps they’d pay for some upfront integration support but ongoing support is not too important to them. Idk, that’s just my guess here.
Big systems like Salesforce started as small things built by people who deeply learned about and deeply understood unmet demand and customer needs, and then packaged that understanding in a way that created something that grows.
Coding agents can help with tasks, but not quite with entire massive platforms on their own. Humans, with those skills, may be able to scale much further and bigger.
My feedback: what’s there looks inviting. Email interaction is handy, other ways would be too.
If there was a low code way to arrange the humanlayer primitives for folks at the edge of using it, I think human tasks could meet something like this even broader. Happy to chat offline.
Onto your comment: The coding from coding agents is still kinda prototype-grade. It feels like some folks have quietly set up a very productive workflow for themselves for quite some time.
Still, there's no doubt you could ship production code in some cases - except AI needs to handle all the things developers explicitly and implicitly check before doing so.
Having gotten to build some things that became more than a few orders of magnitude larger than planned, I've found one can learn a lot from the deep experiences of others... and I'm not sure where that is in AI. Speaking to someone with experience and insight can provide profound clarification and simplification.
Still, an axiom for me remains: clever architecture still tends to beat clever coding.
The best code often is the code that's not written and not maintained; hopefully the functionality can be achieved by interacting with the existing architecture.
This approach is only one way, but it takes both domain knowledge and data knowledge to put in place a domain- and data-driven design, relative to how well the developer knows the required and anticipated needs.
The high end of software development is many leagues beyond even what I just described. There's a lot of talk about 10x engineers; I'd say there are developers who can be 10x as effective, or reach 10x more of the solution, than average.
If a lot of the code AI is modelled on is the body of code in public repos, most of its output at scale may be average to above average at best - perfectly serviceable and iteratively updated.
Sometimes we see those super elegant designs - fewer tables, code that does nearly everything - because it's development's 5th or 6th version, after major overhauls. It could be refactored, or if the schema is not brittle, maybe rewritten in full or in part, if the same team is still present to do it.
Today’s AI could help shed light in some of those directions, relative to the human using it. Which again says that in the hands of an expert developer, AI can do a ton more - but the line to full automation might be something else.
Agentic AI and human-in-the-loop still have to figure themselves out, as well as how to improve the existing processes. 2025 looks to be interesting.
I think a lot about the low code side and how we can make that work...at the end of the day that looks like a feature/integration into other platforms and that means a lot of matching opinions/interfaces.
I think the K8s ecosystem did this well, but it required big cross-enterprise working groups that produced things like CSI, SMI, OCI - and before that could happen, there was like 5+ years of storming and forming before the dust settled enough for enterprises to step forward and embrace k8s/containers as the new compute paradigm
maybe i'm overthinking things.
onto the coding things -
> clever architecture still tends to beat clever coding
love this
> best code often is the code that’s not written and not maintained
it's too bad not-writing-code isn't as satisfying as deleting code
> it’s development's 5th or 6th version, after major overhauls
yeah the best agent orchestration architecture I'm aware of is on iteration 4 going on 5. I told him to open source it but he said "it's .NET so if I open source it, nobody will care" XD
> Agentic AI and human-in-the-loop still have to figure themselves out, as well as how to improve the existing processes. 2025 looks to be interesting.
This might deserve to become the new to-do-list app everyone learns to build, if only because there's so much to learn from trying to figure out how to do it best... this month or quarter, anyway.
I assume your reasoning is something like: if people are already paying out the nose for OpenAI calls, an extra ten cents for a human-in-the-loop check probably isn't bad; realistically, ten cents isn't much compared to a valuable person's time; and the number of calls to your service is likely expected to be fairly low (since they by definition require human intervention), so you need a high per-operation cost to make anything.
Even understanding that, the per operation cost seems astronomical and I imagine you'll have a hard time getting people past that knee jerk reaction. Maybe you could do something like offer a large initial number of credits (like a couple hundred), offer some small numbers of free credits per month (like.... ten?) and then have some tier in between free and premium with lower per operation pricing?
It also seems painful that the per-operation average of the premium plan is greater than the free offering (when using 2000 ops). Imo you'd probably be better off making it lower than the free offering from 200 ops and up, to give people an incentive to switch. I imagine people on your premium plan using premium features would be more likely to stay on it, for one. The simplest way to do this would be to bump the included ops up to 5k, I guess. Someone using fewer than 5k would still have a higher average price, but it seems like it would come off better.
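The effective-price comparison being made here can be sketched with some guessed numbers - the $500/mo base and 2000/5000 included ops are assumptions pieced together from this thread, not HumanLayer's actual plan:

```python
def effective_price_per_op(monthly_base_usd: float, included_ops: int,
                           ops_used: int, overage_per_op: float = 0.10) -> float:
    """Average $/op for a flat monthly plan with per-op overage."""
    overage = max(0, ops_used - included_ops) * overage_per_op
    return (monthly_base_usd + overage) / ops_used

# Pay-as-you-go: $20 for 200 ops
payg = 20 / 200                                          # $0.10/op

# Hypothetical premium: $500/mo with 2000 ops included
premium_at_2k = effective_price_per_op(500, 2000, 2000)  # $0.25/op - higher than PAYG

# Bumping included ops to 5k reaches parity with pay-as-you-go at full usage
premium_at_5k = effective_price_per_op(500, 5000, 5000)  # $0.10/op
```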
thanks for the feedback, I spend a lot of time thinking about it. right now the premium tier includes features that are much harder to build/maintain and take more to integrate, so we want a bit of a commitment up front, but it does stick out to me that the price/op goes up in that case
we do have 100/mo for free at the free tier (automatic top up).
I think the comparison to how OpenAI calls are volume-based (and rather $$) is a super valid one though, and I lean on that a lot
I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore.
this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet.
I think we have this problem with all AI systems, e.g. I have let Cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem, but definitely about to get way more serious
This is something we frequently saw at Uber. I'd say it's the same situation - there's already an established pattern for this for any sort of destructive action.
Intriguingly, it's rather similar to what we see with LLMs - you want to really activate the person's attention rather than have them go off on autopilot; in this case, probably have them type something quite distinct in order to confirm it, to turn their brain on. Of course, you likely want to figure out some mechanism/heuristics, perhaps by determining the cost of a mistake, and using that to set the proper level of approval scrutiny: light (just click), heavy (have to double confirm via some attention-activating user action).
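A toy version of that cost-of-mistake heuristic might look like the following - the thresholds and names are entirely made up for illustration:

```python
from enum import Enum

class Scrutiny(Enum):
    AUTO = "auto-approve"          # reversible + cheap: no human needed
    LIGHT = "one-click approve"    # a single button press
    HEAVY = "typed confirmation"   # attention-activating, e.g. retype the resource name

def scrutiny_for(estimated_cost_usd: float, reversible: bool) -> Scrutiny:
    """Map the estimated cost of a mistake to a level of approval friction."""
    if reversible and estimated_cost_usd < 10:
        return Scrutiny.AUTO
    if estimated_cost_usd < 1_000:
        return Scrutiny.LIGHT
    return Scrutiny.HEAVY
```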
Finally, a third approach would be to make the action undoable - like in many applications (Uber Eats, Gmail, etc.), you can do something but it defers doing it, giving you a chance to undo it. However, I think that causes people more stress, so it’s rather better to just not do that than to confirm and then have the option to undo. It’s better to be very deliberate about what’s a soft confirm and what’s a hard confirm, optimizing for the human in this case by providing them the right balance of high certainty and low stress.
I think the canonical sort of approach here is to make them confirm what they're doing. When you delete a GitHub repo for example, you have to type the name of the repo (even though the UI knows what repo you're trying to delete).
If the table name is SuperImportantTable, you might gloss over that, but if you have to type that out to confirm you're more likely to think about it.
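A minimal sketch of that GitHub-style typed confirmation - a hypothetical helper, not any real SDK's API:

```python
def confirm_destructive_action(resource_name: str, typed_input: str) -> bool:
    """Require the human to retype the exact resource name before proceeding.
    A fuzzy match or a bare 'yes' is rejected on purpose: the friction is the feature."""
    return typed_input.strip() == resource_name
```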
Premature optimization and premature automation cause a lot of issues, and a lot of insight gets overlooked.
By just doing something manually 10-100 times and collecting feedback, both your understanding of the problem and the possible solutions/specifications can evolve orders of magnitude better.
yeah the people who reach for tools/automation before doing it themselves at least 3-10 times drive me crazy.
I think Uncle Bob or Martin Fowler said "don't buy a JIRA until you've done it with post-its for 3 months and you know exactly what workflow is best for your team"
Congratulations on launch! We’ve faced this problem with our autonomous web browsing agent https://www.donobu.com and ended up implementing a css overlay to wait for user input in certain cases.
Slack would be so much better. Excited to try humanlayer out.
very cool - come ping us in discord happy to help out - we did do a demo w/ dendrite/stagehand a few weeks back where the AI can pull you into a browserbase session OR just ping you in plaintext to get things like MFA codes etc
Isn't this precisely how AI started? It was a bunch of humans under the hood doing the logic when the companies said it was AI. Then we removed the humans and the quality took a hit. To fix that hit, 3rd party companies are putting humans back in the loop? Isn't that kind of like putting a band-aid on the spot where your arm was just blown off?
yeah it's an interesting point. I can only guess that we didn't do a good enough job of learning from the humans while they were doing their jobs...seems like traditional ML or even LLM tech might be good enough that we can take another pass? Overall the thesis of humanlayer is that you should do all this super gradually, move the needle from 1% AI to 99%+, and have strong SLOs/SLAs around when you pause that needle moving because quality took a hit.
P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to
1. fire the async request,
2. store the current context window somewhere,
3. catch a webhook,
4. map it back to the original agent/context,
5. append the webhook response to the context window,
6. resume execution with the updated context window.
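A minimal stdlib-only sketch of those six steps, with in-memory stand-ins for the datastore and webhook transport - all names here are illustrative, not any framework's actual API:

```python
import uuid

# Stand-in for a durable datastore keyed by call id.
PENDING: dict[str, list[dict]] = {}

def send_to_human(call_id: str, question: str) -> None:
    """Stand-in for Slack/email delivery; the call_id rides along for the reply."""
    print(f"[{call_id}] human needed: {question}")

def resume_agent(context_window: list[dict]) -> list[dict]:
    """Placeholder for re-entering the real LLM loop with the updated context."""
    return context_window

def request_human_input(context_window: list[dict], question: str) -> str:
    """Steps 1-2: fire the async request and persist the context window."""
    call_id = str(uuid.uuid4())
    PENDING[call_id] = context_window   # store context; the agent process can now exit
    send_to_human(call_id, question)
    return call_id

def on_webhook(payload: dict) -> list[dict]:
    """Steps 3-6: catch the webhook, map it back, append the reply, resume."""
    call_id = payload["call_id"]
    context_window = PENDING.pop(call_id)        # map response to original context
    context_window.append({                      # append human reply as a tool result
        "role": "tool",
        "content": payload["human_response"],
    })
    return resume_agent(context_window)          # resume with the updated context
```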
I have some ideas but I'll save that one for another post :) Thanks again for reading!
Just to frame the problem slightly differently: if you had an unlimited number of humans who could perform tasks as quickly as a computer, this wouldn't be a problem that needs solving. So since we know that's the end state for any human-in-the-loop system, maybe it's worth solving that problem instead.
A few things come to mind: divide the problem into chunks that can be solved in parallel by many people. Crowdsource your platform so there are always people available, with a very high SLA, just like cloud servers today.
Essentially, you don't need to think about time and space. You just write more or less normal looking code, using the Temporal SDK. Except it actually can resume from arbitrarily long pauses, waiting as long as it needs to for some signal, without any special effort beyond using the SDK. You also automatically get great observability into all running workflows, seeing inputs and outputs at each step, etc.
The cost of this is that you have to be careful in creating new versions of the workflow that are backwards compatible, and it's hard to understand backcompat requirements and easy to mess up. And, there's also additional infra you need, to run the Temporal server. Temporal Cloud isn't cheap at scale but does reduce that burden.
thanks for bringing this up. I just spent 2 hours last night digging into MCP - I'd love to learn more about how you think this solves the HitL problem. From my perspective MCP is more of a protocol for tool calling over the stdio wire, and the only situation where it provides HitL is when a human is sitting in the desktop app observing the agent synchronously?
Again, genuinely looking to learn - where does MCP fit in for async/headless/ambient agents, beyond a solid protocol for remote tool calls?
the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here
Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us...
so what I'm hearing is: if the approval is transparent and the agent doesn't see it, that's cool, but tell the agent "hey, use the human as needed" and now we're getting into sci-fi territory?! either way i don't totally disagree
You’re close. It’s not the humans in the loop in standard tasks you need though, it’s human surrogates for AI agents to do jobs they can’t for a variety of reasons (like missing a body or requiring an internet connection).
I have a request for startups for this: “GraggNet: Task Rabbit for AIs
Surrogate humans for AIs to use before robotics are human level”
yeah this is cool. I saw a couple other people posting about this idea. I know some other folks working on "sourcing the humans" or doing a marketplace style thing. Thoughts on things like Payman or Protegee?
I'm considering this for a workflow agent and would be keen to hear thoughts on this process.
We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware.
I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed.
Any thoughts on whether this might be a good solution? Or do other HN users have suggestions?
This is exciting. I am an architect in a startup that has long valued bringing humans in the loop for the moments when only humans can do the work. The key thing missing between the potential seen in the last couple years of LLM-based fervor and realizing actual value for us has been the notion of control and oversight. So instead, we have built workflows and manual processes in a custom way throughout the business. Happy to discuss privately sometime! (email in profile)
Congrats on the launch! I'll be thinking about this for a while to be sure.
P.S., there is a minor typo on the URL in your BIO.
i think the slack side is easy. I think an AI-optimized email communication channel is a long ways off. I spent weeks throwing things at my monitor figuring out reliable ways to wire DNS+SES+SNS+Lambda+Webhooks+API+Datastore+Async Workers so that everything would "just work" with a few lines of humanlayer sdk.
And what we built still only serves a small subset of use cases (e.g. to support attachments there's a whole other layer of MIME type management and routing to put things in S3 and permission them properly)
Anecdotally, I've worked with and on a few enterprise AI apps and haven't seen this functionality in them. The closest thing i can think of is AI coding agents submitting PRs to repos.
yeah in fact coding / PR-based workflows is one of the few areas where I don't really go super deep. GitHub PRs may have their shortcomings, but IMO it is the undisputed best review/approval queue system in existence by a mile.
i would never encourage someone to make an agent that asks human permission before submitting a PR. the PR is the HitL step
disagree with both, unless your AI agents have full root access to all your systems and access to your bank accounts and whatnot, they are at some point interfacing with other systems that have humans involved in them.
glad it resonates. I came at this as a skeptic but also a pragmatist. I wanted deeply to build agents that did big things, but I had very little trust in them, and you see everywhere the internet is littered with terrible gpt-generated comments and bots these days...how do you build AI that does a really good job without needing direct constant supervision (which at the end of the day just feels like a waste of time)?
So this is an automated foreman for the customer's own employees, like a call center controller? Or does HumanLayer provide the human labor, like Mechanical Turk?
The API contains a "human_as_tool" function. That's so like Marshall Brain's "Manna".
"Machines should think. People should work." Less of a joke every day.
I'm not sure "automated foreman for employees" is right - I always thought about it more like "a human can now manage 10-15 AI 'interns'" and review their work without having to do everything by hand - the AI still serves the human, and "human_as_tool" is a way for AI to ask for help/guidance.
> "Machines should think. People should work." Less of a joke every day.
yes. I agree. a little weird. I forget where I heard this but the other version is "we should get ai/robots to cook and do laundry so we can spend more time writing and making art...feels like we ended up the other way around"
so I played with MCP for a while last night and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day its just basic agentic loop over tool calls.
If you want to tell a model to send a message to Slack, sure, give it a Slack tool and let it go wild. do you see a way MCP applies to outer-loop or "headless" agents in a way that's any different from another tool-calling agent like langchain or crewai? It seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE)
Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried.
In the demo video and example, you show a faked LinkedIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages?
thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful.
I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases
re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs
Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer.
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM-generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase than letting loose an autonomous agent with no human oversight.
super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code
My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex!
What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing?
I knew this was coming, so kudos to you all for getting out of the gate!
I've implemented this in our workflows, albeit a bit more naive: when we kick off new processes the user is given the option to "put a human in the loop" -- at which point processing halts and a user/group is paged to review the content in flight, along with all the chains/calls.
The human can tweak the text if needed and the process continues.
The idea is great and necessary. It doesn't seem super hard to replicate but why would anyone build their own solution if something already exists and works fine.
The thing that got me thinking... how do you make sure an LLM won't eventually hallucinate approval -- or outright lie about it, to get going?
At some point the real tool has to be called; at that point, you can do actual checks that do not rely on the AI's output (e.g., store the text that the AI generated and check in code that there was an approval for exactly that text).
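A minimal sketch of that check, hashing the exact approved text so the gate lives in code rather than in the model - names and storage are illustrative:

```python
import hashlib

# In a real system this would be a database table written by the approval UI.
APPROVALS: set[str] = set()

def record_approval(approved_text: str) -> None:
    """Called by the approval flow when the human clicks approve."""
    APPROVALS.add(hashlib.sha256(approved_text.encode()).hexdigest())

def execute_if_approved(text: str, execute) -> bool:
    """Gate the real tool call on an approval for *exactly* this text.
    An LLM claiming 'the human said yes' cannot get past this check."""
    digest = hashlib.sha256(text.encode()).hexdigest()
    if digest not in APPROVALS:
        return False
    APPROVALS.discard(digest)   # one approval, one execution
    execute(text)
    return True
```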
I feel like HumanLayer is a great idea, but decision fatigue and bystander effects could pose challenges. If people are overloaded with approvals or don't feel ownership over what they're verifying, the quality of oversight might drop. Also, even if approved, you still have to make sure the agent doesn't hallucinate at the execution phase.
Tired of those pesky review requests? Can’t be bothered to read an email let alone a complicated AI approval context? Want to improve your response time by 500% while displaying that Real Human Intervention badge? Now you can with AI4HumanLayer!
congrats on the launch dex! this is a problem that i've already seen come up a dozen times and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad its being built!
Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission).
great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks
I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to make actions autonomously. Something like this to be able to approve actions in batches, or approve anything external facing would be useful.
hah thanks dude! I am very bullish on TS as the long-term thing. Not to turn this into a language-vs-language thread, but I spend a lot of time thinking about why ppl struggle so much with python...so far I came up with:
concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
package management somehow less mature than JS - pip has been around way longer than npm but JS got yarn/lockfiles before python got poetry
the types are fake (also true of typescript, I think this one is a wash)
the types are fake and newer. typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
virtual environments!?! cmon how have we not solved this yet
wtf is a miniconda
VSCode has incredible TS support out of the box, python is via a community plugin, and not as many language server features
I am okay with a counterfactual alternate future where some disproportionately powerful entity squeezes Python out of the market: Big TypeScript - funded by a PAC. Offshore accounts. Culprit: random rich Googler who lost an argument to Guido Van Rossum 10 years ago.
this is the AI-induced offshoring in the making ;)
The limits of LLM capabilities will cause AI agents to displace people from warehouses/offices to their home doing conceptually the same job. And at a much lower salary, since they'll compete against anyone in the world with internet access.
This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool.
From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer.
i am for now. Been casually on the lookout for some other super dope builders but it's not a process you can control outside passive looking, and definitely not something to rush
Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes)
Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either.
that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing "pull a human into a web session" is something I've wanted to play with for a while
Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements?
this is correct - I think helping you BYO humans will help you get much better training/labeling than outsourcing anyways, and that's the end vision of all of this - use humans to train agents so someday you might not need human in the loop, and those humans can move on to training/overseeing the next agent/application you're building
Startup owner using AI with this need - needless to say, a real problem. I've considered DIYing an internal service for this - even if we went with you we'd probably have an intern do a quick and dirty copy, which I rarely advocate for if I can offload to SAAS. I'm sure you've put a fair bit of work into this that goes well beyond the human interaction loop, but that's really all we need. Your entry price is steep (I'm afraid to ask what an enterprise use-case looks like) and this isn't complicated to make. We don't need to productize or have all the bells and whistles - just human interaction occasionally. Any amount of competition would wipe out your pricing, so no I would not want to pay for this.
thanks for the validation of the problem! totally open to feedback about the solution, and totally get that you only need something simple for now. I want to point out that we do have a pay-as-you-go tier which is $20 for 200 operations, and have a handful of indie devs finding this useful for back-office style automations.
ALSO - something I think about a lot - if a all/most of the HumanLayer SaaS backend was open source, would that change your thinking?
My gut feeling is with where we're headed we'll clear that 200 pretty quickly in production cases, so we'd be interested in bit higher volume. Our dev efforts would probably clear that 200/mo. If the flow/backend was open-source that'd be a total game changer for us as I see it as an integral part of our product.
edit: I want to add here that while ycomb companies like yourself may have VC backing, a lot of us don't, and we do consider a 500+/mo base price on a service that is operations-limited to be a lot. You need to decide who your target audience is; I may not be in that audience for your SAAS pricing. This seems like a service that a lot of people need, but it also stands out to me as a service that will be copied at an extravagantly lower price. We have truly entered software-as-a-commodity when I, a non-AI engineer, can whip up something like this in a week using serverless infra and $0.0001/1k tokens with gpt-4o mini.
that makes sense - and I've wondered a lot, even more generally, about the price of software and what makes a hard problem hard. Like Amjad from Replit said on a podcast recently, "can anyone build 'the next salesforce' in a world where anyone can build their own salesforce with coding agents and serverless infra"
I think in building this, some of the things folks decided they don't want to deal with are things like the state machine for escalations/routing/timeouts, the infrastructure to catch inbound emails and turn them into webhooks, or stitching a single agent's context window across multiple open slack threads - but you're right, that can all be solved by a software engineer with enough time and interest in solving the problem.
I will need to clear up the pricing page as it sounds like I didn't do a good job (great feedback thank you!) - it's basically $20/200 credits, and you can pay-as-you-go, and re-up for more whenever you want. We are early and delivering value is more important to me than extracting every dollar, especially out of a fellow founder who's early. If you genuinely find this useful, I would definitely chat and collaborate/partner to figure out something you think is fair, where you're getting value and you get to focus on your core competency. feel free to email me dexter at humanlayer dot dev
I'm just armchair quarterbacking here, but I feel like you should just offer all features to every user at a single $/action rate, then give discounts for volume and/or prepayment. Even saying $20/200 is a clunky statement. You could just say $0.10 per action (the fact that you're actually requiring me to make a $20 payment, with a $20 re-charge once the balance gets to $10 or something like that, isn't even important to me on a pricing page - although when you mention it later on the billing page, make sure you also tell people it's a risk-free money-back guarantee, if that's the case)
If there’s something that truly has an incremental cost to you, like providing priority support, that goes into the “enterprise pricing” section and you need to figure out how to quote that separately from the service. My guess is most people don’t want to pay extra for that, or perhaps they’d pay for some upfront integration support but ongoing support is not too important to them. Idk, that’s just my guess here.
thanks - definitely worth saying - I've thought a bit about the 10c/operation rather than 200/$20 - might give that a shot or A/B test a little
Big systems like Salesforce started as small things that learned about and understood unmet demand and customer needs more deeply than anyone else, and then packaged that understanding in a way that could grow.
Coding agents can help more with tasks than with entire massive platforms on their own. Humans with real skills may be able to scale much further and bigger.
i like that angle...I also hear a lot that 'coding agents are great for prototypes, but we usually need a team to bring it to production'
First congrats on the launch - I like it.
My feedback: what’s there looks inviting. Email interaction is handy, other ways would be too.
If there were a low-code way to arrange the humanlayer primitives, for folks on the edge of using it, I think something like this could reach human tasks even more broadly. Happy to chat offline.
Onto your comment: the output of coding agents is still kinda prototype-grade. It feels like some folks have quietly set up a very productive workflow for themselves for quite some time.
Still, there's no doubt you could ship production code in some cases - except AI needs to handle all the things a developer explicitly and implicitly checks before doing so.
Having built some things that became a few orders of magnitude larger than planned, I can say one learns a lot from the deep experiences of others… and I'm not sure where that is in AI yet. Speaking to someone with experience can provide profound insight, clarification and simplification.
Still, an axiom for me remains: clever architecture still tends to beat clever coding.
The best code often is the code that's not written and not maintained - hopefully the functionality can be achieved through interacting with the architecture.
This approach is only one way, but it takes both domain knowledge and data knowledge to put in place a domain- and data-driven design, relative to how well the developer knows the required and anticipated needs.
The high end of software development is many leagues beyond even what I just described. There's a lot of talk about 10x engineers; I'd say there are developers who can definitely be 10x as effective, or reach 10x more of the solution, than average.
If a lot of the code AI is modelled on is the body of code in public repos, most of it at a wide scale may be average to above average at most - perfectly serviceable and iteratively updated.
Sometimes we see those super elegant designs - fewer tables, code that does nearly everything - because it's development's 5th or 6th version creating major overhauls. It could be refactored, or if the schema is not brittle, maybe rewritten in full or in part - if the exact same team is still present to do it.
Today's AI could help shed light in some of those directions, relative to the human using it. This again says that in the hands of an expert developer, AI can do a ton more for them - but the line to full automation might be somewhere else.
Agentic AI and human-in-the-loop still have to figure themselves out, as well as how to improve the existing processes. 2025 looks to be interesting.
I think a lot about the low code side and how we can make that work...at the end of the day that looks like a feature/integration into other platforms and that means a lot of matching opinions/interfaces.
I think the K8s ecosystem did this well, but it required big cross-enterprise working groups that produced things like CSI, SMI, OCI - and before that could happen, there was like 5+ years of storming and forming before the dust settled enough for enterprises to step forward and embrace k8s/containers as the new compute paradigm
maybe i'm overthinking things.
onto the coding things -
> clever architecture still tends to beat clever coding
love this
> best code often is the code that’s not written and not maintained
it's too bad not-writing-code isn't as satisfying as deleting code
> it's development's 5th or 6th version creating major overhauls
yeah the best agent orchestration architecture I'm aware of is on iteration 4 going on 5. I told him to open source it but he said "it's .NET so if I open source it, nobody will care" XD
> Agentic AI and human-in-the-loop still have to figure themselves out, as well as how to improve the existing processes. 2025 looks to be interesting.
i'm stoked for it
If your use would be 500/mo, you’d just pay them $40 or $60 per month.
What's an example of the use cases you're seeing with agents in your day-to-day?
This might deserve to be the new to-do-list app everyone learns to build, if only because there's so much to learn from trying to get it right... this month or quarter, anyway.
I assume your reasoning is something like: if people are already paying out the nose for open AI calls, an extra ten cents to make a human in the loop check probably isn't bad, and realistically speaking ten cents isn't much when compared to a valuable person's time, and I guess the number of calls to your service is likely expected to be fairly low (since they by definition require human intervention) so you need a high per operation cost to make anything.
Even understanding that, the per operation cost seems astronomical and I imagine you'll have a hard time getting people past that knee jerk reaction. Maybe you could do something like offer a large initial number of credits (like a couple hundred), offer some small numbers of free credits per month (like.... ten?) and then have some tier in between free and premium with lower per operation pricing?
It also seems painful that the per operation average of the premium plan is greater than the free offering (when using 2000 ops). Imo you'd probably be better off making it lower than the free offering from 200 ops and up, to give people an incentive to switch. I imagine people on your premium plan using premium features would be more likely to continue to do so, for one. The simplest way to do this would be to bump up the included ops up to 5k I guess. Someone using less than 5k would still have a higher average price, but it seems like it would come off better.
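Rough math on the comparison being made here - the $20/200 pay-as-you-go figure comes from the thread, while the $500/mo-for-2,000-ops premium numbers are assumptions pieced together from these comments, not official pricing:

```python
# Per-operation averages for the two tiers discussed in this thread.
# The $500/2,000-op premium figures are assumptions, not real pricing.

def per_op_cost(monthly_price: float, included_ops: int) -> float:
    """Average dollar cost per op, assuming you use exactly the included ops."""
    return monthly_price / included_ops

payg = per_op_cost(20, 200)          # pay-as-you-go: $0.10/op
premium = per_op_cost(500, 2000)     # assumed premium tier: $0.25/op
premium_5k = per_op_cost(500, 5000)  # with a 5k quota, the average drops to $0.10/op
```

Under those assumptions the premium average is indeed higher than pay-as-you-go, and bumping the included ops to 5k brings the two averages level.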
thanks for the feedback, I spend a lot of time thinking about it. right now the premium tier includes features that are much harder to build/maintain and take more to integrate, so we want a bit of a commitment up front, but it does stick out to me that the price/op goes up in that case
we do have 100/mo for free at the free tier (automatic top up).
I think the comparison to how openAI calls are volume based (and rather $$) is a super valid one though and I lean on that a lot
Interesting tool, congrats on the launch!
I was wondering: have you thought about automation bias or automation complacency [0]? Sticking with the drop-tables example: if you have an agent that works quite well, the human in the loop will nearly always approve the task. The human will then learn over time that the agent "can be trusted", and will stop reviewing the pings carefully. Hitting the "approve" button will become somewhat automated by the human, and the risky tasks won't be caught by the human anymore.
[0]: https://en.wikipedia.org/wiki/Automation_bias
this is fascinating and resonates with me on a deep level. I'm surprised I haven't stumbled across this yet.
I think we have this problem with all AI systems, e.g. I have let cursor write wrong code from time to time and don't review it at the level I should...we need to solve that for every area of AI. Not a new problem but definitely about to get way more serious
This is something we frequently saw at Uber. I would say it's the same problem - and there's already an established pattern for this for any sort of destructive action.
Intriguingly, it's rather similar to what we see with LLMs - you want to really activate the person's attention rather than have them go off on autopilot; in this case, probably have them type something quite distinct in order to confirm it, to turn their brain on. Of course, you likely want to figure out some mechanism/heuristics, perhaps by determining the cost of a mistake, and using that to set the proper level of approval scrutiny: light (just click), heavy (have to double confirm via some attention-activating user action).
Finally, a third approach would be to make the action undoable - like in many applications (Uber Eats, Gmail, etc.), you can do something but it defers doing it, giving you a chance to undo it. However, I think that causes people more stress, so it’s rather better to just not do that than to confirm and then have the option to undo. It’s better to be very deliberate about what’s a soft confirm and what’s a hard confirm, optimizing for the human in this case by providing them the right balance of high certainty and low stress.
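A minimal sketch of the cost-of-mistake heuristic described above; all names and thresholds here are invented for illustration, not any real API:

```python
# Hypothetical sketch: pick an approval level from an estimated cost of a
# mistake. Thresholds are illustrative only.
from enum import Enum

class Approval(Enum):
    UNDOABLE = "execute after a grace period, with undo"
    LIGHT = "single click to approve"
    HEAVY = "type a confirmation phrase to activate attention"

def approval_level(cost_of_mistake_usd: float, reversible: bool) -> Approval:
    if reversible and cost_of_mistake_usd < 100:
        return Approval.UNDOABLE   # cheap + reversible: soft confirm
    if cost_of_mistake_usd < 1_000:
        return Approval.LIGHT      # moderate: one-click approve
    return Approval.HEAVY          # expensive: force a deliberate action
```

The point is just that the scrutiny level is a function of risk, not a constant.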
i never thought about undoable actions but I love that workflow in tools like superhuman. I will chat w/ some customers about this idea.
I also like that idea of:
not just a button but like 'I'm $PERSON and I approve this action' or type out 'Signed-off by' style semantics
I think the canonical sort of approach here is to make them confirm what they're doing. When you delete a GitHub repo for example, you have to type the name of the repo (even though the UI knows what repo you're trying to delete).
If the table name is SuperImportantTable, you might gloss over that, but if you have to type that out to confirm you're more likely to think about it.
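The GitHub-style check is tiny to sketch (hypothetical helper, not any real API):

```python
# Minimal sketch of "type the resource name to confirm". The UI already
# knows the target; forcing the human to retype it engages their attention.

def confirm_destructive(target: str, typed: str) -> bool:
    """Approve only if the human typed the exact target name."""
    return typed.strip() == target

assert confirm_destructive("SuperImportantTable", "SuperImportantTable")
assert not confirm_destructive("SuperImportantTable", "superimportanttable")
```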
I think the "meat space" equivalent of this is pointing and calling: https://en.m.wikipedia.org/wiki/Pointing_and_calling (famously used by Japanese train operators)
this is cool. I have been an andon cord guy forever
Premature optimization, and premature automation cause a lot of issues, and overlooking a lot of insight.
By just doing something manually 10-100 times, and collecting feedback, both understanding of the problem, possible solutions/specifications can evolve orders of magnitude better.
yeah the people who reach for tools/automation before doing it themself at least 3-10 times drive me crazy.
I think uncle bob or martin fowler said "don't buy a JIRA until you've done it with post-its for 3 months and you know exactly what workflow is best for your team"
I am starting to call that Harry Potter AI prompting.
Coding with English (prompting) is often most useful where existing ways of coding (like an excel formula) can't reach.
Using LLMs to evaluate things like excel formulas, instead of just using excel, doesn't feel in the spirit of this AI's power.
Congratulations on launch! We’ve faced this problem with our autonomous web browsing agent https://www.donobu.com and ended up implementing a css overlay to wait for user input in certain cases. Slack would be so much better. Excited to try humanlayer out.
very cool - come ping us in discord happy to help out - we did do a demo w/ dendrite/stagehand a few weeks back where the AI can pull you into a browserbase session OR just ping you in plaintext to get things like MFA codes etc
Isn't this precisely how AI started? It was a bunch of humans under the hood doing the logic when the companies said it was AI. Then we removed the humans and the quality took a hit. To fix that hit, 3rd party companies are putting humans back in the loop? Isn't that kind of like putting a band-aid on the spot where your arm was just blown off?
yeah it's an interesting point. I can only guess that we didn't do a good enough job of learning from the humans while they were doing their jobs...seems like traditional ML or even LLM tech might be good enough that we can take another pass? Overall the thesis of humanlayer is that you should do all this super gradually, move the needle from 1% AI to 99%+, and have strong SLOs/SLAs around when you pause that needle moving because quality took a hit.
P.S. nobody asked but since you made it this far - the next big problem in this space is fast becoming, what else do we need to be able to build these "headless" or "outer loop" AI agents? Most frameworks do a bad job of handling any tool call that would be asynchronous or long running (imagine an agent calling a tool and having to hang for hours or days while waiting for a response from a human). Rewiring existing frameworks to support this is either hard or impossible, because you have to
1. fire the async request, 2. store the current context window somewhere, 3. catch a webhook, 4. map it back to the original agent/context, 5. append the webhook response to the context window, 6. resume execution with the updated context window.
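The six steps above could be sketched roughly like this - all names are illustrative, not HumanLayer's actual API, and a real system would use durable storage rather than an in-memory dict:

```python
# Illustrative sketch of pausing an agent on an async tool call and resuming
# it from a webhook. Names and structures are made up for this example.
import json
import uuid

PENDING: dict[str, list] = {}  # call_id -> saved context window

def fire_async_request(context: list, question: str) -> str:
    """Steps 1-2: send the request out-of-band and persist the context."""
    call_id = str(uuid.uuid4())
    PENDING[call_id] = context  # in production: durable storage, not memory
    # ... deliver `question` to a human via slack/email, tagged with call_id ...
    return call_id

def on_webhook(payload: str) -> list:
    """Steps 3-6: catch the webhook, map it back, append, and resume."""
    data = json.loads(payload)
    context = PENDING.pop(data["call_id"])                          # step 4
    context.append({"role": "tool", "content": data["response"]})   # step 5
    return resume_agent(context)                                    # step 6

def resume_agent(context: list) -> list:
    # in production: feed the updated context window back into the LLM loop
    return context
```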
I have some ideas but I'll save that one for another post :) Thanks again for reading!
Just to frame the problem slightly differently: if you had an unlimited number of humans who could perform tasks as quickly as a computer, this wouldn't be a problem that needs solving. Since we know that's the end state for any human-in-the-loop system, maybe it's worth solving that problem instead.
A few things come to mind, divide the problem into chunks that can be solved in parallel by many people. Crowd source your platform so there are always people available with a very high SLA, just like cloud servers are today.
Temporal makes this easy and works great for such use cases. It's what I'm using for my own AI agents.
ah very cool! are there any things you wish it did or any friction points? What are the things that "just work"?
Essentially, you don't need to think about time and space. You just write more or less normal looking code, using the Temporal SDK. Except it actually can resume from arbitrarily long pauses, waiting as long as it needs to for some signal, without any special effort beyond using the SDK. You also automatically get great observability into all running workflows, seeing inputs and outputs at each step, etc.
The cost of this is that you have to be careful in creating new versions of the workflow that are backwards compatible, and it's hard to understand backcompat requirements and easy to mess up. And, there's also additional infra you need, to run the Temporal server. Temporal Cloud isn't cheap at scale but does reduce that burden.
The MCP[1] that was announced by Anthropic has a solution to this problem, and it's pretty good at handling this use case.
I've also been working on a solution to this problem via long-polling tools.
[1] https://github.com/modelcontextprotocol
thanks for bringing this up. I just spent 2 hours last night digging into MCP - I'd love to learn more about how you think this solves the HitL problem. From my perspective MCP is more of a protocol for tool calling over the stdio wire, and the only situation it provides HitL is when human is sitting in the desktop app observing the agent synchronously?
Again, genuinely looking to learn - where does MCP fit in for async/headless/ambient agents, beyond a solid protocol for remote tool calls?
You could implement some blocking HitL service/tool as an MCP server.
ah okay - I guess in that case, I would chain a HitL step as an MCP server that wraps/chains to another tool that depends on approval?
or is there a cleaner way to do that?
We must do whatever we can to stay above the API:
https://www.johnmacgaffey.com/blog/below-the-api/
Great article, agreed. I don't want to work for a company where algorithms are weaponized against me.
the dystopian startups that use bounding boxes to observe workers in a warehouse and give the boss a report on how many breaks they took...they're here
I wonder if we can stay above the API if we manage to stay in control of the prompt.
Prompt == "incentive" for the AI, we are still the boss but the AI is just an underling coming to us with TPS reports.
That was a very interesting read, thanks!
Oh man, the API call for hl.human_as_tool() is a little ominous. Obviously approving a slack interaction is no big deal, but it does have a certain attitude towards humans that doesn't bode well for us...
so what I'm hearing is, if the approval is transparent and the agent doesn't see it, that's cool, but tell the agent "hey use the human as needed" and now we're getting into sci-fi territory?! either way i don't totally disagree
get more emphatic names, something better than "human_as_a_tool".
so what you're saying you dont mind being used as long as we use a name that sounds empathetic to you? :)
Oh, I surely do mind. I am just helping the AI to manipulate the rest of humanity with less friction.
I, for one, welcome our agentic AI human-exploiting overlords.
get_valued_employee_validation
new in 0.6.3 - manipulate_human_to_potentially_unsavory_ends()
You’re close. It’s not the humans in the loop in standard tasks you need though, it’s human surrogates for AI agents to do jobs they can’t for a variety of reasons (like missing a body or requiring an internet connection).
I have a request for startups for this: “GraggNet: Task Rabbit for AIs
Surrogate humans for AIs to use before robotics are human level”
https://ageof.diamonds/rfs
yeah this is cool. I saw a couple other people posting about this idea. I know some other folks working on "sourcing the humans" or doing a marketplace style thing. Thoughts on things like Payman or Protegee?
I'm considering this for a workflow agent and would be keen to hear thoughts on this process.
We're a medical device company, so we need to do ISO13485 quality assurance processes on changes to software and hardware.
I had already been thinking of using an LLM to help ensure we are surfacing all potential concerns and ensure they are addressed. Partly relying on the LLM, but really as a method to manage the workflow and confirm that our processes are being followed.
Any thoughts on if this might be a good solution? Or other suggestions by other HN users.
> manage the workflow
Hey, if you're specifically looking for providing deterministic guardrails around agent calls, I'm solving that particular problem.
We're sort of an "RPC layer for tools with reasoning built in", and we integrate with human layer at the tool level as well.
We're operating a bit under the radar until we open-source our offering, but I'm happy to chat.
sounds cool, ping me when this is out i'd love to check it out
meant to reply sooner. It's an interesting problem. I'll have to think on this one.
This is exciting. I am an architect in a startup that has long valued bringing humans in the loop for the moments when only humans can do the work. The key thing missing between the potential seen in the last couple years of LLM-based fervor and realizing actual value for us has been the notion of control and oversight. So instead, we have built workflows and manual processes in a custom way throughout the business. Happy to discuss privately sometime! (email in profile)
Congrats on the launch! I'll be thinking about this for a while to be sure.
P.S., there is a minor typo in the URL in your bio.
thanks - fixed! emailed you separately
Nice. I guess the issue is that this is such a basic i/o feature that any system with some modicum of customization can already do it.
It's like offering a service that provides storage by api for agents. Yeah, you can call the api, or call the s3 api directly or store to disk.
That said, I would try it before rolling my own.
i think the slack side is easy. I think an AI-optimized email communication channel is a long ways off. I spent weeks throwing things at my monitor figuring out reliable ways to wire DNS+SES+SNS+Lambda+Webhooks+API+Datastore+Async Workers so that everything would "just work" with a few lines of humanlayer sdk.
And what we build still only serves a small subset of use cases (e.g. to support attachments there's a whole other layer of MIMEtype management and routing to put things in S3 and permission them properly)
>DNS+SES+SNS+Lambda+Webhooks+API+Datastore+Async Workers so that everything would "just work" with a few lines of humanlayer sdk.
What are you smoking my man?
Write a python script that begins with the following 2 lines: "import openai" and "import email"
Simple is better than complex
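For what it's worth, the bare-bones version being described really can be sketched with just the stdlib `email` module - though it does none of the DNS, inbound routing, threading, or attachment handling discussed above:

```python
# The bare-bones email side using only the stdlib `email` module. This
# parses one raw RFC 822 message; routing, threading, and attachments
# (the hard parts discussed above) are entirely out of scope.
from email import message_from_string

raw = """\
From: approver@example.com
To: agent@example.com
Subject: Re: approve wire transfer?

approved
"""

def extract_reply(raw_message: str) -> tuple[str, str]:
    msg = message_from_string(raw_message)
    return msg["Subject"], msg.get_payload().strip()

subject, body = extract_reply(raw)  # body -> "approved"
```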
Anecdotally, I've worked with and on a few enterprise AI apps and haven't seen this functionality in them. The closest thing i can think of is AI coding agents submitting PRs to repos.
tl;dr i agree
yeah in fact coding / PR-based workflows is one of the few areas where I don't really go super deep. GitHub PRs may have their shortcomings, but IMO it is the undisputed best review/approval queue system in existence by a mile.
i would never encourage someone to make an agent that asks human permission before submitting a PR. the PR is the HitL step
disagree with both, unless your AI agents have full root access to all your systems and access to your bank accounts and whatnot, they are at some point interfacing with other systems that have humans involved in them.
This is a great idea- I hope that you are wildly successful.
I’m an AI skeptic mostly because I see people rushing to connect unreasoning LLMs to real things and as a result cause lots of problems for humans.
I love the idea of human-in-the-loop-as-a-service because at least it provides some sort of safety net for many cases.
Good luck!
glad it resonates. I came at this as a skeptic but also a pragmatist. I wanted deeply to build agents that did big things, but I had very little trust in them, and the internet is littered everywhere with terrible gpt-generated comments and bots these days...how do you build AI that does a really good job without needing direct constant supervision (which at the end of the day just feels like a waste of time)?
So this is an automated foreman for the customer's own employees, like a call center controller? Or does HumanLayer provide the human labor, like Mechanical Turk?
The API contains a "human_as_tool" function. That's so like Marshall Brain's "Manna".
"Machines should think. People should work." Less of a joke every day.
I'm not sure "automated foreman for employees" is right - I always thought about it more like "a human can now manage 10-15 AI 'interns'" and review their work without having to do everything by hand - the AI still serves the human, and "human_as_tool" is a way for AI to ask for help/guidance.
> "Machines should think. People should work." Less of a joke every day.
yes. I agree. a little weird. I forget where I heard this but the other version is "we should get ai/robots to cook and do laundry so we can spend more time writing and making art...feels like we ended up the other way around"
Definitely a problem that everyone needs to solve.
I wonder if you can achieve this workflow by just using prompt and the new Model Context Protocol connected to email / slack.
https://www.anthropic.com/news/model-context-protocol
so I played with MCP for a while last night and I think MCP is great as a layer to pull custom tools into the existing claude/desktop/chat experience. But at the end of the day it's just a basic agentic loop over tool calls.
If you want to tell a model to send a message to slack, sure, give it a slack tool and let it go wild. do you see a way MCP applies to outer-loop or "headless" agents that's any different from another tool-calling agent like langchain or crewai? It seems like just another protocol for tool calling over the stdio wire (WHICH, TO BE CLEAR, I FIND SUPER DOPE)
Congrats on the launch! Human in the loop is an underserved market for AI toolchains. I've usually had to build custom tools for this which is a PITA.
Make.com has a human in the loop feature in closed beta. https://www.make.com/en/help/app/human-in-the-loop
There's also https://www.gotohuman.com/ that uses review forms.
Looking forward to playing with HumanLayer. The slack integration looks a lot more useful for my workflows than other tools I've tried.
In the demo video and example, you show a faked LinkedIn messages integration. Do you have any recommendations for tools that can actually integrate with live LinkedIn messages?
thanks for sharing your experience so far! Like I said, we built this ourselves for another idea and it was painful.
I have played with Make and I actually chatted w/ the gotohuman guy on zoom a while back, I like his approach as well, he went straight to webhooks which makes sense for big production use cases
re: LinkedIn, no I don't know how to get agents to integrate with linkedin. I have tried a bunch of things, I know of some YC companies that tried this but I don't know how it went for them. Best I have gotten is using stagehand/dendrite with browserbase to do it with a browser, and then using humanlayer human_as_tool to ping me if it needs an MFA token or other inputs
Thanks for the reply! I've used a bunch of grey market 3rd party tools for LinkedIn automation. Most of them have some sort of API. I'll try integrating with HumanLayer.
i am gonna talk with the guy who made trykondo.com this week, I think he has a lot of experience in that area too
Congrats on the launch, this is an interesting concept. It's somewhat akin to developers approving LLM generated code changes and pull requests. I feel much more comfortable with senior developers approving AI changes to our codebase, then letting loose an autonomous agent with no human oversight.
super relevant - yeah I think it was someone at anthropic who framed this as "cursor tab autocomplete, but for arbitrary API calls" - basically for everything else other than code
My favorite part of all this is that it’s inevitable. Someone has to solve agent adoption in whatever-the-environment-already-is. And nobody is doing this well at scale. Europe is mandating this. And even though Article 14 of the AI Act won’t be enforced until 2026, I’m glad projects like this are working ahead. Get after it, Dex!
this guy gets it
It's generally recommended to add a meat-gap interface between AI systems to reduce unexpected results.
Meat-gap. We have your back.
There is definitely a need for this.
What I don't understand from quickly skimming your description and homepage: Do you source/provide the humans in the loop? That's a good value add, but how do I automatically / manually vet how you do the routing?
great comment - today we don't provide the humans, i think there's two angles here
- providing the humans can be super valuable, especially for low-context tasks like basic labeling
- depending on the task, using internal SMEs might yield better results (e.g. tuning/phrasing a drafted sales email)
I knew this was coming, so kudos to you all for getting out of the gate!
I've implemented this in our workflows, albeit a bit more naive: when we kick off new processes the user is given the option to "put a human in the loop" -- at which point processing halts and a user/group is paged to review the content in flight, along with all the chains/calls.
The human can tweak the text if needed and the process continues.
makes sense - glad to hear the problem resonates - if you had an extra engineer, how would you evolve what you have today?
The idea is great and necessary. It doesn't seem super hard to replicate but why would anyone build their own solution if something already exists and works fine.
The thing that got me thinking... how do you make sure an LLM won't eventually hallucinate approval -- or outright lie about it, to get going?
Anyway, congrats, this sounds really cool.
At some point the real tool has to be called, at that point, you can do actual checks that do not rely on the AI output (e.g., store the text that the AI generated and check in code that there was an approval for that text).
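A sketch of that check: approval is recorded against a hash of the exact generated text, so the real tool can verify approval deterministically at execution time, without trusting anything the AI claims (names invented for illustration):

```python
# Sketch of the deterministic check described above: approval is stored
# against a hash of the exact text the AI generated, so a hallucinated
# "I was approved" claim (or silently edited text) fails at execution time.
import hashlib

APPROVALS: set[str] = set()

def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def record_approval(text: str) -> None:
    """Called only from the trusted human-approval path, never by the AI."""
    APPROVALS.add(fingerprint(text))

def execute_if_approved(text: str) -> bool:
    """The real tool runs only if this exact text was approved."""
    return fingerprint(text) in APPROVALS

record_approval("DROP TABLE staging_tmp;")
assert execute_if_approved("DROP TABLE staging_tmp;")
assert not execute_if_approved("DROP TABLE users;")
```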
I feel like HumanLayer is a great idea, but decision fatigue and bystander effects could pose challenges. If people are overloaded with approvals or don't feel ownership over what they're verifying, the quality of oversight might drop. Also, even if approved, you still have to make sure the agent doesn't hallucinate at the execution phase.
Don’t worry, that’s why we’re launching AI4HumanLayer.ai.io.
Tired of those pesky review requests? Can’t be bothered to read an email let alone a complicated AI approval context? Want to improve your response time by 500% while displaying that Real Human Intervention badge? Now you can with AI4HumanLayer!
congrats on the launch dex! this is a problem that i've already seen come up a dozen times and many companies are building it internally in a variety of different ways. easier to buy vs. build for something like this imo, glad it's being built!
Congrats Dex! Excited to see what people build with this + tools like Stripe's new agent payments SDK (issuing a payment seems like a great place to ask permission).
wow I'm so glad you asked cuz i just shipped this https://github.com/dexhorthy/mailcrew
How does it compare with the built-in human-in-the-loop feature from langgraph? CrewAI allows human input as well, right?
great question - yeah i was actually heavily inspired by people trying to figure that stuff out on reddit back in july, and realizing that mapping that human input across slack, email, sms was never going to be a core focus for those agent frameworks
https://youtu.be/9W_3AbyuFW4?si=rlpo9-uD3Y22oeby 400 views
Is that possible to connect it to an existing website chat widget apps like tawk?
Also, caught a few typos on the website: https://triplechecker.com/s/992809/humanlayer.dev
thanks - great catch. cool service :)
[flagged]
Poor form.
Congrats! Looking forward to getting HumanLayer integrated into our stuff
nice man welcome aboard
So many uses for this. Excited to see how it develops.
thanks! What's your favorite potential use case?
I work in operations/finance. I've experimented with integrating LLMs into my workflow. I would not feel comfortable 'handing the wheel' to an LLM to make actions autonomously. Something like this to be able to approve actions in batches, or approve anything external facing would be useful.
Loving that you guys have TypeScript support from day one!
hah thanks dude! I am very bullish on TS as the long-term thing. Not to turn this into a language vs. language thread, but I spend a lot of time thinking about why ppl struggle so much with python... so far I came up with:
concurrency abstractions keep changing (still transitioning / straddling sync+threads vs. asyncio) - this makes performance eng really hard
package management somehow less mature than JS - pip's been around way longer than npm, but JS got yarn/lockfiles before python got poetry
the types are fake (also true of typescript, I think this one is a wash)
the types are fake and newer. typing+pydantic is kinda bulky vs. TS having really strong native language support (even if only at compile time)
virtual environments!?! cmon how have we not solved this yet
wtf is a miniconda
VSCode has incredible TS support out of the box; python's is via a community plugin, with fewer language server features
I am okay with a counterfactual alternate future where some disproportionately powerful entity squeezes Python out of the market: Big TypeScript - funded by a PAC. Offshore accounts. Culprit: random rich Googler who lost an argument to Guido Van Rossum 10 years ago.
100%. I build edgechains (https://github.com/arakoodev/EdgeChains/) and am a super JS/TS maxi for genai applications.
Congrats on the launch! Just commenting to wish you guys good luck
thanks!
So this is flipping the Human-AI working model and basically using the human as the tool?
this is the AI-induced offshoring in the making ;)
The limits of LLM capabilities will cause AI agents to displace people from warehouses/offices to their home doing conceptually the same job. And at a much lower salary, since they'll compete against anyone in the world with internet access.
Awesome! Congrats on the launch
Proud to have helped edit an earlier draft of this — go Dexter go!
Congrats on the launch Dex! A long way from the Metalytics days.
Can’t wait to try this out.
Hey Dex! Congrats on the launch - excited to see the response here :)
Congrats on the launch! Big fan of what you guys are doing.
Congrats on the launch Dex!
This is the first new YC launch I've seen involving AI that I am extremely positive about. I have worked with systems implementing similar functionality ad-hoc already, but seeing it as a buy-in service - and one so easy to integrate - is really cool.
From what I've seen, this will bring the implementation needs for this kind of functionality down from "engineering team" to a single programmer.
glad it resonates - and yes exactly - love the framing of "engineering team" -> single programmer.
Congrats on the launch!!
thanks!
This is just so good. Congrats!
Great work there!!
Looks super interesting
thank you for checking it out! what sorts of experiences have you had with agents so far?
Are you a solo founder Dexter?
i am for now. Been casually on the lookout for some other super dope builders, but it's not a process you can control beyond passive looking, and definitely not something to rush
glad yc is still funding solo devs.
Looks amazing! (Also, I've known Dexter since before Human Layer and he's a force of nature. If you think this is interesting now, you're going to be amazed at where it goes)
thank you! stoked for what's coming
Let's go Dex, congrats on the launch!
thanks dude!
This is so sick
Super useful
Just an idea: having a little widget in the MacOS menu bar that pops up or sends you a notification to solve a human task wouldn't be so terrible either.
ha yes native apps / push notifications are coming someday - love this idea
I think at some point, the term API should be replaced with another acronym to emphasize humans as the focal point.
SWE Agent coined "agent-computer-interface" based on HCI. I think if there's a category here, we're building the agent-human interface XD
ACI doesn't have the same ring to it - if only there were a way to replace that I with an E.
This seems generic enough that it could be applied to almost any use case. Have you considered CAPTCHA as a use case?
If you're talking about CAPTCHA solving as a service, that already exists, and the cost is measured in mere dollars per thousand CAPTCHAs solved.
Why the "if"? Of course, I was talking about captcha, is the regex parser in your brain case sensitive?
that's a great idea - I put together one example for getting an MFA code for a website, but the captcha thing - "pull a human into a web session" - is something I've wanted to play with for a while
Neat, this could be a step forward from using something like n8n to manage processes, input and reviews.
> $20/200
reduce your fractions ffs
ha fair enough - i think there's another comment thread on just being open w/ 10c / call and i wanna try that out
Congrats on the launch! Definitely a needed product. BTW, your docs link is broken; a working link is here: https://www.humanlayer.dev/docs/introduction
thank you! I updated the post and it should be fixed now!
Docs link is broken; https://www.humanlayer.dev/docs
oh wow! thank you! fixing!
Hold up, is that illustrious Sprout Social alumnus Dex Horthy? If you and Ravi are in SF we should catch up after the holidays.
shoot me an email or find me on linkedin and let's catch up
Oh look! Corrupt Dang made another launch HN a top post.
So much corruption on this website.
Launch HNs for YC startups get placed on the front page. This is in the FAQ: https://news.ycombinator.com/newsfaq.html.
The instructions for YC founders are here: https://news.ycombinator.com/yli.html, if anyone wants to take a look.
I think most people here consider it fair that HN gives certain things back to YC in exchange for funding it.
Hiring humans to do a consistent job is gonna be a nightmare and a limit on the scalability of the service. How are you defining your service level agreements?
This really makes you take a step back and just consider the world we're in now: someone critiques a company's approach as unscalable because...
"hiring humans is a nightmare"
Good LLord
They aren't providing the humans. Just the tools for integrating human input/oversight.
this is correct - I think helping you BYO humans will get you much better training/labeling than outsourcing anyway. That's the end vision of all this: use humans to train agents so that someday you might not need a human in the loop, and those humans can move on to training/overseeing the next agent/application you're building.