Yep, it wouldn't be too surprising if it already exists. It'd also potentially function something like a more advanced diff tool that saves time for the people who use it.
If mature and reliable enough it could actually be used to predict which changes are likely to cause future problems, and then the code reviewer(s) would have to decide how they'd like to act based on that information.
Very cool. One could do this over time, and then automatically find out where values change between different versions of the code. Might be able to find bugs before they ever occur.
This is actually really similar to something I've been wanting to build for a long time. In my case I've thought it would be useful to have a way to calculate the likelihood for a given change to break things based on the history of breaking changes in the same file or area of the file. Basically a riskiness score for each change. The risk score could be associated with each PR and would provide a signal for reviewers about which code should get a bit of extra attention as well as highlighting the risky changes when they are being deployed.
The tricky part would be tracking the same part of the code as it moves up and down because of insertions/deletions above it which would cause problems for a naive algorithm based on line numbers.
Just doing it at the file level, like this does, might be good enough to be useful though.
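As a rough illustration of that riskiness score, here is a minimal sketch that scores a file by the fraction of its past commits that look like fixes or reverts, using nothing but git log. The keyword heuristic and the file paths are assumptions for illustration, not a real tool.

    # Rough sketch of a per-file riskiness score: the fraction of past
    # commits touching a file whose messages look like fixes or reverts.
    # The keyword heuristic and file paths are illustrative assumptions.
    import subprocess

    def commit_subjects(path):
        out = subprocess.run(
            ["git", "log", "--follow", "--pretty=%s", "--", path],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.splitlines()

    def risk_score(path):
        subjects = commit_subjects(path)
        if not subjects:
            return 0.0
        breaking = sum(
            1 for s in subjects
            if any(kw in s.lower() for kw in ("fix", "revert", "hotfix"))
        )
        return breaking / len(subjects)

    for f in ["src/payments.py", "src/utils.py"]:  # hypothetical paths
        print(f, round(risk_score(f), 2))

A real version would weight recent history more heavily and aggregate the per-file scores over the files a PR touches, but even this crude ratio gives reviewers a ranking signal.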
I could also imagine storing metadata from the IDE next to the PR. The IDE knows a lot about a change when doing refactorings that isn't trivially reflected in the diff, e.g. extractions and renamings...
If I could do a review and the system told me what happened instead of what changed in some cases, that would be so useful.
Never ever having to review 1000 lines of class renamings again...
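As a sketch of what that IDE-emitted metadata could look like, here is a hypothetical record a review tool might read to collapse a rename into a single line. The schema and field names are made up for illustration, not any existing IDE's format.

    # Hypothetical refactoring record an IDE could attach to a PR, so a
    # review tool could show "renamed OrderProcessor -> OrderService
    # (214 occurrences)" instead of hundreds of textual hunks.
    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class RefactoringRecord:
        kind: str          # e.g. "rename", "extract_method"
        old_symbol: str
        new_symbol: str
        occurrences: int   # how many textual edits this refactoring explains

    record = RefactoringRecord(
        kind="rename",
        old_symbol="OrderProcessor",
        new_symbol="OrderService",
        occurrences=214,
    )

    # Stored next to the PR, e.g. as a sidecar JSON file or an API field.
    print(json.dumps(asdict(record), indent=2))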
Sounds good, although it would have to be context aware. For example, code that often gets removed in a production environment might be dissimilar to code that is typically removed in dev or testing.
There are also other triggers of code removal and refactoring that are outside the code base, such as an organisation migrating to a different platform. An AI trained on a large public commit history could encourage a general shift towards already-established big players, punishing smaller organisations.
This seems like a good enterprise use-case for AI. We had automated visual diffs at my last company, but it was very noisy. It did help to confirm my change didn't visually impact some other code I wasn't aware of, but we still had to write intelligent UI unit tests.
Having AI write those UI tests would save a ton of time, and if the AI could determine the intent of a code change and correctly fail/pass tests, that would eliminate all my debug work, leaving just the feature code to me. Hopefully AI doesn't take that away from me too ;)
It is a net positive, I just wish they would focus more on their core users. Their basic code diff tool could use a lot of love to make it on par with the diffs generated from other tools.
Tools like this seem like a great way to manage breaking changes if done right. Far too often, especially with JavaScript, I'll find myself with a project whose dependencies are 2-3 major versions out of date, and the documentation of those versions' breaking changes is of course very poor and lacking in sufficiently detailed examples.
I like the motivation – including a programmer's active concerns in language design – but I'm not sure of the purpose of this change tracking.
When does a person look at changes? Reviewing effects, reviewing code changes, maybe finding a regression...
Source code alone doesn't pull that together. And because there's no changelog, reviewing and understanding changes is going to be harder.
If there was a way of matching up all those changes with program effects, it could be interesting. I.e., match up source code diffs with execution diffs. I'm not exactly sure what an "execution diff" is, but I think it could be cool ;)
What I'm saying is that you could just integrate some tooling which would detect when changes are or are not made in a compatible way, instead of relying on the human making a correct annotation.
You could try a cool method of random inspection. Have a daemon that listens for submissions to your tree. With a random chance (you determine probability) have it email the contextual diff (maybe even highlighted full diff) and the changelist description to a random developer.
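A minimal sketch of that daemon, assuming a plain git checkout, a local SMTP server, and made-up reviewer addresses; the polling interval and probability are arbitrary.

    # Sketch of the random-inspection daemon: poll the repo for new commits
    # and, with some probability, mail the commit message plus diff to a
    # randomly chosen developer. Addresses, SMTP host, interval and
    # probability are all made-up assumptions.
    import random, smtplib, subprocess, time
    from email.message import EmailMessage

    REVIEWERS = ["alice@example.com", "bob@example.com"]   # hypothetical
    PROBABILITY = 0.2

    def git(*args):
        return subprocess.run(["git", *args], capture_output=True,
                              text=True, check=True).stdout.strip()

    last_seen = git("rev-parse", "HEAD")
    while True:
        time.sleep(60)
        current = git("rev-parse", "HEAD")
        if current != last_seen and random.random() < PROBABILITY:
            msg = EmailMessage()
            msg["Subject"] = "Random review: " + current[:10]
            msg["From"] = "review-bot@example.com"
            msg["To"] = random.choice(REVIEWERS)
            msg.set_content(git("show", current))   # message + full diff
            with smtplib.SMTP("localhost") as smtp:
                smtp.send_message(msg)
        last_seen = current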
If scenario 2 or 3 is a regular occurrence then one of two things is true: your team is too large for the code base (so people are crawling over each other), or the underlying code is too tightly coupled (so people are stepping on each other's toes). Either way, the team is going to spend too much time merging in each other's changes compared to, say, writing actual code.
Reviewboard has had the ability to compare different versions of the diff for a long time. I don't see anything new or innovative being proposed in this post.
I have yet to come across a review tool which provides inline code coverage (with before and after snapshots). This would let me check that a) any tests included actually exercise the new code, and b) the new code hasn't impacted the testability of old code. It isn't a guarantee, but it would make the job easier.
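As a rough sketch of how a tool could compute that, here's a toy script that parses a unified diff for added line numbers and compares them against a set of covered lines. The diff text and the coverage mapping are hard-coded stand-ins for whatever your coverage report actually produces.

    # Toy diff-coverage check: parse a unified diff for added line numbers,
    # then report which of those lines are not covered. The diff text and
    # the `covered` mapping are hard-coded stand-ins for real inputs.
    import re

    DIFF = "\n".join([
        "--- a/src/orders.py",
        "+++ b/src/orders.py",
        "@@ -10,2 +10,3 @@",
        " def total(items):",
        "-    return sum(items)",
        "+    cleaned = [i for i in items if i is not None]",
        "+    return sum(cleaned)",
    ])

    covered = {"src/orders.py": {10, 11}}  # new line 12 never executed

    def added_lines(diff):
        per_file, path, lineno = {}, None, 0
        for line in diff.splitlines():
            if line.startswith("+++ b/"):
                path = line[6:]
                per_file[path] = set()
            elif line.startswith("@@"):
                lineno = int(re.search(r"\+(\d+)", line).group(1))
            elif line.startswith("+"):
                per_file[path].add(lineno)
                lineno += 1
            elif not line.startswith("-"):
                lineno += 1
        return per_file

    for path, lines in added_lines(DIFF).items():
        missed = sorted(lines - covered.get(path, set()))
        print(path, "uncovered new lines:", missed)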
This is clever, but truly content-based change detection in make would seriously fix a bunch of issues. I'm sort of surprised it hasn't been done already.
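For illustration, a minimal content-based sketch in Python: hash the inputs and only rebuild when the hashes change, regardless of timestamps. The file names and build command are hypothetical.

    # Sketch of content-based (rather than mtime-based) change detection:
    # rebuild a target only when the hash of its inputs actually changed.
    # Paths and the rebuild command are illustrative.
    import hashlib
    import json
    import pathlib
    import subprocess

    STATE = pathlib.Path(".build-hashes.json")

    def digest(path):
        return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

    def build_if_changed(inputs, command):
        old = json.loads(STATE.read_text()) if STATE.exists() else {}
        new = {p: digest(p) for p in inputs}
        if new != {p: old.get(p) for p in inputs}:
            subprocess.run(command, check=True)
            old.update(new)
            STATE.write_text(json.dumps(old, indent=2))
        else:
            print("inputs unchanged by content, skipping rebuild")

    # Hypothetical usage: re-run the compiler only when file contents
    # change, even if timestamps were touched.
    build_if_changed(["main.c", "util.c"], ["cc", "-o", "app", "main.c", "util.c"])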
The next big thing will be semantic code versioning, i.e. a tool that understands language syntax and can track when a symbol is changed.
There are already experiments in this area[1], but I don't think any are ready for mass adoption yet.
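As a toy example of what symbol-level tracking might look like, here's a sketch using Python's ast module to report which top-level functions or classes were added, removed, or changed between two versions of a file. The before/after sources are made up.

    # Minimal sketch of symbol-level change tracking for Python: compare two
    # versions of a file and report which top-level functions/classes were
    # added, removed, or modified. OLD/NEW stand in for real file versions.
    import ast
    import textwrap

    OLD = textwrap.dedent("""
        def total(items):
            return sum(items)

        def greet(name):
            return "hi " + name
    """)

    NEW = textwrap.dedent("""
        def total(items):
            return sum(i for i in items if i is not None)

        def farewell(name):
            return "bye " + name
    """)

    def symbols(source):
        tree = ast.parse(source)
        return {
            node.name: ast.dump(node)   # structural fingerprint of the symbol
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        }

    old, new = symbols(OLD), symbols(NEW)
    print("added:  ", sorted(new.keys() - old.keys()))
    print("removed:", sorted(old.keys() - new.keys()))
    print("changed:", sorted(n for n in old.keys() & new.keys() if old[n] != new[n]))

A diff at this level would report "greet removed, farewell added, total modified" rather than a pile of red/green lines.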
The fact that it's 2022 and we're still working with red/green line-based diffs is absurd. Our tools should be smarter and we shouldn't have to handhold them as much as we do with Git. It wouldn't surprise me if the next big VCS used ML to track code changes and resolve conflicts. I personally can't wait.
If there's tooling that can do this efficiently it sounds like it can be a win since all changes in implementation details become explicit and can potentially be more easily tracked.