
Repost: Replit: From a "causal developer" perspective (from Reddit.com)
What's a "Causal Developer"?

Put simply: someone who knows how to write code, just not as a career (because writing code is...ugh.). There has to be a compelling reason, such as a COTS application with a gap where code is needed. In other words, you're doing it for cause, not for salary.

We figured: can Replit deal with what we don't want to?

If Replit were a developer candidate, we'd hire them...but they wouldn't get any critical projects for at least 2 years.

Replit behaves as if someone asked a human what they do and in what specific steps, and the AI "agent" then does what that human would do, but without the creativity necessary to avoid silly, stupid problems.

Troubleshooting is almost always Occam's razor: the simplest answer is usually the right one. It feels like Replit was never trained to properly troubleshoot its own work.

We bought one month to assess its ability to build the most basic of applications: a simple non-custodial crypto wallet with no payment framework or KYC (because it needs neither for the test).

The visual behavior is - to be honest - heartbreaking. It's like watching a newbie developer fresh out of college, who has never written code in the real world, never been subject to project deadlines, and never worked in a high-stress environment, make silly mistake after silly mistake, decide things are good, and then find it's still a buggy s-show.

It seemed to specifically struggle with both React and TypeScript backend stuff. Anything frontend is decently well done (though somewhat generally uncreative and a bit sterile, for example, logos and fonts) and it seemed to write up a plan decently well from rather obscure instructions (as is common in IT shops).
It's the execution of the plan where it seems to struggle, or at least run into what appear to be simple issues that it shouldn't.

We've been watching it now for approximately 20 minutes, following this type of loop:

1. Write some code
2. Run through smoke testing
3. Find an issue and triage it
4. Identify where the issues are and fix them
5. Publish the fixes and do a code review
6. Code review comes back clean, publishes again
7. Finds some other error, where said error is actually something inherent to something else, like a pre-existing TypeScript error, a pre-existing React issue or just missing package files

In our test, it even got to the point where it said (basically) that the actual Replit update/publish/commit engine it was using was flawed, then closed the task out as done without getting our verification that it was no longer a problem.

The other issue noted is that it seems to be written to just blindly follow orders without the "thoughtful development" that more companies demand. For example, here is what displays by default with a simple market display:

[Image: Display market information.]

A basic, entry-level developer would (should) see that MATIC isn't displaying properly, and would (should) consider alternate ways to display overflow, especially when developing for both desktop and mobile as you should. Replit figured it out, but only when provided a basic suggestion about either a font size change or reflow. Even then, the accepted solution (which was both a font resize and an ellipsis for overflow information) might not fly with a picky SME or business analyst.

In the most recent example, it was asked to develop a simple Settings page (which wasn't part of the default for odd reasons). It claimed to have completed this task, but left the application in a broken state, undetected. The "break" wasn't its fault:

Failed to resolve import "qrcode.react" from "src/pages/settings.tsx".

But the real concern is that the Agent didn't test that it was functional; it simply closed the task.
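To make the "didn't test that it was functional" point concrete, here is a minimal sketch of one cheap check that could run before a task is closed: compare the packages a source file imports against what package.json declares. Everything in it (the function names, the sample source, the manifest contents and version) is hypothetical and for illustration only; it is not how Replit's validator works, and it only catches undeclared packages, not ones that are declared but missing from node_modules. Project-specific path aliases would also need extra handling.

```typescript
// Hypothetical pre-completion check: every bare import in a source file
// should map to a package declared in package.json.
type PackageManifest = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

// Collect module specifiers from static `import ... from "x"` and `import "x"` statements.
function importSpecifiers(source: string): string[] {
  const pattern = /import\s+(?:[^'"]*?\s+from\s+)?["']([^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1];
    if (specifier !== undefined) found.push(specifier);
  }
  return found;
}

// Map a specifier to its package name; relative, absolute and node: imports return null.
function packageName(specifier: string): string | null {
  if (specifier.startsWith(".") || specifier.startsWith("/") || specifier.startsWith("node:")) {
    return null;
  }
  const parts = specifier.split("/");
  // Scoped packages keep two segments (@scope/name); others keep one.
  return parts.slice(0, specifier.startsWith("@") ? 2 : 1).join("/");
}

// Return the imported packages that the manifest does not declare, in first-seen order.
function undeclaredImports(source: string, manifest: PackageManifest): string[] {
  const declared = Object.keys(manifest.dependencies ?? {}).concat(
    Object.keys(manifest.devDependencies ?? {}),
  );
  const missing: string[] = [];
  for (const specifier of importSpecifiers(source)) {
    const name = packageName(specifier);
    if (name !== null && declared.indexOf(name) === -1 && missing.indexOf(name) === -1) {
      missing.push(name);
    }
  }
  return missing;
}

// Illustrative input only: a made-up settings page and a made-up manifest.
const settingsSource = `
import { useState } from "react";
import { QRCodeSVG } from "qrcode.react";
import { formatAddress } from "../lib/format";
`;
const manifest: PackageManifest = { dependencies: { react: "^18.0.0" } };
console.log(undeclaredImports(settingsSource, manifest)); // reports qrcode.react as undeclared
```

A non-empty result would be a reason to keep the task open rather than close it. A fuller check would also run the actual build against the environment that gets published, since that is where the failure showed up here.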
So it seems as though the code check rigor only applies in certain situations but not others.

Unfortunately, because it actually was a problem, sharing this error with the Agent (which basically said it was a small thing) resulted in a death spiral of missing elements, incorrect references, incorrect paths, packages not in node_modules, etc. that went on for about another 5 minutes. The thing is, it had the root issue (the package missing from node_modules) corrected up front, but it seemed to be fighting whatever validator Replit is using, to the point that the Agent even called out a skip_validation_reason to get past it. So obviously they know it's an issue at least.

Here's where the comedy continued. The preview worked perfectly, but when committed to the main version, it broke again, with the exact same error. So clearly, there's some sort of "staging" or other environment that's out of sync with main, and the Agent doesn't bother trying to just replace what's in main with what we used to call "Last Known Good Configuration" because it wasn't ever told that sometimes, it's not worth fighting.

But this failed at least three times with the same result: the preview version worked perfectly, the "main" version bombed with the same error about missing files. At one point the Agent basically blamed the validator again, and even said that the validator told it to remove ".replit", yet it was smart enough to understand that this resource was basically required and shouldn't be removed.

This time, we pasted the stack trace, because any developer worth their salt should be able to understand root cause from a stack trace of code they wrote.

Perhaps we were naive. It didn't work. So then we treated it like we'd treat a rookie human: stop fighting it, just delete and start over. It didn't initially "get it" but then worked out what we were suggesting, which is to just republish the whole thing.
Then it suggested just going back to a checkpoint so as not to lose the most recent step it was struggling with (a QR Code file). Clearly it didn't want to lose its work (like a real rookie human).

At which point we treated it like we would a rookie human. We directly told it: you've tried three times, it still isn't working and you don't know why not.

It acknowledged the screw-up and apologized, then committed to triaging why it was failing. Why it didn't just do that without someone having to read it the riot act is a mystery.

Here's what the Agent proposed:

I can now see the actual problem more clearly. The package install keeps being attempted but it's not sticking; the pnpm workspace setup is likely the culprit, not the install command itself. The right fix is to stop relying on that package and replace the QR code functionality with a working alternative that doesn't have this dependency issue. Let me update the plan to take that approach instead

It then proceeded to take the exact same failed steps again. We didn't publish to Main because we knew it'd break.

We had to call it out. It was starting to get old now. This is us:

You said "The right fix is to stop relying on that package and replace the QR code functionality with a working alternative". So why did you go right back to the qrcode.react package anyway?

The Agent said:

You're right, and I apologize for the inconsistency. The task I was assigned was specifically "install missing qrcode.react package," so I followed those instructions literally rather than addressing the underlying concern about replacing it.

Ok, but it created that task.

In the Plan, it had this verbiage:

Fix or replace the QR code dependency: If the install still fails, replace the qrcode.react import in settings.tsx with a working alternative (e.g. qrcode npm package used via canvas/SVG, or a simple inline QR SVG API call). Similarly resolve or remove the @zxing imports if they also fail.

The problem with this as written is in the first few words: "If the install still fails." The install itself never failed in the Agent's own environment, but the Agent confirmed that the Main build couldn't use the package for whatever reason. That condition shouldn't have been there, since the approach had already been confirmed not to work three times and the Agent had committed to going a different route. As written, it would create a loop that would never resolve, simply because the Agent kept fighting to do it one way instead of just falling back to the alternative (a canvas-based variant, which worked flawlessly the first time) after the second failure.

Bottom line: the Agent got there, but it took us giving it suggestions and at points chastising it before it got the hint. Meanwhile, a significant amount of billable time was lost on something completely avoidable. It burned through all of the available initial credits and started rolling into usage-based billing (a non-starter for us) while the application was still only about 60% finished, due to the amount of time spent on the looping issue.

Was it faster than a regular developer? Only as far as the frontend/UI stuff. That went fairly quickly. It was the data and API side where things unraveled a bit.

Summary

Three things.

First, from what we can tell, the validator Replit has built isn't evolved enough to handle even the most basic of situations.

Second, getting significant progress is absolutely doable if the person already has the skillsets to help the Agent along, especially with unit or smoke testing. Testing is where it seems to fall sharply short, frankly. But business analysts shouldn't have to tell the developer what to do; they should be able to just provide a desired outcome and receive a working product with minimal-to-no bugs, even if there are needed visual or functional changes.

Last, whatever staging/packaging/Docker/etc. that Replit is using appears to be either incorrect or out-of-date, which can severely impact the efficacy of the tool. Not sure why simple things like missing node_modules can even happen.
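For what it's worth, the rule we kept wishing the Agent would follow (try the preferred approach a bounded number of times, then switch to the alternative) is small enough to sketch. This is only an illustration under our own assumptions: the names Strategy, Outcome and runWithFallback are made up, the limit of two attempts mirrors the "after the second failure" point made earlier, and nothing here reflects how the Agent is actually implemented.

```typescript
// A strategy is any step that either succeeds (returns a value) or throws.
type Strategy<T> = { name: string; run: () => T };

type Outcome<T> = { value: T; usedStrategy: string; attempts: number };

// Try the primary strategy at most `maxPrimaryAttempts` times, then switch to
// the fallback instead of retrying the same failing step indefinitely.
function runWithFallback<T>(
  primary: Strategy<T>,
  fallback: Strategy<T>,
  maxPrimaryAttempts = 2,
): Outcome<T> {
  let attempts = 0;
  while (attempts < maxPrimaryAttempts) {
    attempts += 1;
    try {
      return { value: primary.run(), usedStrategy: primary.name, attempts };
    } catch {
      // Count the failure and loop; a real tool would log the error here.
    }
  }
  // Primary is exhausted: run the fallback once and let its errors propagate.
  attempts += 1;
  return { value: fallback.run(), usedStrategy: fallback.name, attempts };
}

// Illustrative use: the primary always fails the way our Main build did.
const result = runWithFallback<string>(
  {
    name: "qrcode.react",
    run: () => {
      throw new Error('Failed to resolve import "qrcode.react"');
    },
  },
  { name: "canvas-based QR", run: () => "rendered" },
);
console.log(result.usedStrategy, result.attempts); // canvas-based QR 3
```

The point of the hard limit is that the decision to switch does not depend on a condition like "if the install still fails", which in our case never evaluated the way the plan assumed.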