I’m rather curious to see how the EU’s privacy laws are going to handle this.
(Original article is from Fortune, but Yahoo Finance doesn’t have a paywall)
I don’t doubt that mathematically, but in practice it sounds functionally equivalent to just retraining the model. If it were more efficient to calculate the model weights directly from the input data, that’s what we would do, and there would be no need for a training process at all: we could start with a completely untrained model and calculate the difference between it and one trained on all the data. Actually, the more I think about it, the more I do doubt it mathematically. Whether this is feasible would depend heavily on the details of the model and how it was trained. The order in which data is presented during training often affects the final result, so there’s no guarantee that your subtraction would give the same, or even a similar, result as retraining without the specified data. Can you point to any papers on the topic?
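To make the ordering point concrete, here is a toy sketch. It assumes a one-parameter model y = w * x trained with plain SGD, which is nowhere near a real LLM. The data points, the learning rate and the `sgd` helper are all made up for illustration. It compares two things: training on the same data in a different order, and "subtracting" one sample's recorded update versus retraining without that sample.

```python
def sgd(data, lr=0.1, w=0.0):
    """One pass of plain SGD on squared error for the 1-D model y = w * x.

    Returns the final weight and a list of (sample, step) pairs so we can
    see how much each sample moved the weight when it was visited.
    """
    updates = []
    for x, y in data:
        grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
        step = -lr * grad
        w += step
        updates.append(((x, y), step))
    return w, updates


# Hypothetical toy data set and the sample we want to "forget".
data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (0.5, 1.5)]
forget = (3.0, 7.0)

# Same data, two presentation orders.
w_full, updates = sgd(data)
w_reversed, _ = sgd(list(reversed(data)))

# Naive "unlearning": subtract the step that the forgotten sample caused.
w_subtracted = w_full - sum(step for sample, step in updates if sample == forget)

# Ground truth for comparison: retrain from scratch without that sample.
w_retrained, _ = sgd([s for s in data if s != forget])

print("trained in original order:", w_full)
print("trained in reversed order:", w_reversed)
print("after naive subtraction:  ", w_subtracted)
print("retrained without sample: ", w_retrained)
```

The subtraction misses the retrained weight because every update made after the forgotten sample was computed from a weight that sample had already shifted. Undoing its own step does not undo that downstream influence.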
You are correct. It would be heinously expensive to “remove” training data. Even training a very rudimentary model can take hours on a high-end tensor processor.