Highly recommend checking this out, especially the links to Will’s sklearn example code. Data is not included, but if you want to learn how to use the training and prediction pipeline in sklearn, it’s a great place to start.
In the figure below, I’ve used a pass probability model (see end of post for details and code) to estimate the difficulty of completing each pass and then compared this to the actual passing outcomes at a team level. This provides a global measure of how much a team disrupts their opponents’ passing. We see the Premier League’s main pressing teams with the greatest disruption, through to the barely corporeal form represented by Sunderland.
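For anyone wanting to see the shape of that team-level comparison, here is a minimal sketch. It is not the model or code from the post: it assumes each pass already has a predicted completion probability from some pass probability model, and it takes "disruption" to be the expected completion rate minus the actual completion rate of passes played against a team. The function name `team_disruption`, the tuple layout, and the example rows are all hypothetical.

```python
from collections import defaultdict


def team_disruption(passes):
    """Return {defending_team: expected_rate - actual_rate}.

    `passes` is an iterable of (defending_team, p_complete, completed)
    tuples, where p_complete is a model's predicted completion probability
    and completed is the observed outcome. This tuple layout is an
    assumption made for illustration.
    """
    # Per defending team: [sum of predicted probabilities, completed passes, pass count]
    totals = defaultdict(lambda: [0.0, 0, 0])
    for defending_team, p_complete, completed in passes:
        entry = totals[defending_team]
        entry[0] += p_complete
        entry[1] += int(completed)
        entry[2] += 1

    # Positive values mean opponents completed fewer passes than expected.
    return {
        team: (expected - actual) / count
        for team, (expected, actual, count) in totals.items()
    }


# Made-up rows, purely to show the calculation.
passes = [
    ("Team A", 0.75, False),
    ("Team A", 0.75, True),
    ("Team B", 0.5, True),
    ("Team B", 0.5, True),
]
disruption = team_disruption(passes)
```

With these made-up rows, opponents of Team A complete fewer passes than the model expects, so Team A gets a positive score, while Team B's opponents do better than expected and the score is negative. Averaging per pass rather than summing keeps teams that face different numbers of passes comparable, though that is a design choice rather than something stated in the post.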