The Home of Fantasy Sports Analysis

Excel-ing at Fantasy Baseball: PART 5 – The Joy of Duplicates

Welcome to part 5 of Excel-ing at Fantasy Baseball. If you haven’t read the first three installments of the series yet, I’d recommend you start there (part 1, part 2, and part 3part 4), so you can see the material we’ve covered (and maybe even go back to the other article I wrote linked that showed some of the benefits of Excel), to see how powerful and useful a tool Excel can be in Fantasy Baseball. It is also vital as we move on this series that you understand the fundamentals of the previous articles. Finally, we have the beginning of your own spreadsheet built over the last few weeks that we will continue to build this week, so skip the first 4 parts at your own risk.


What?! Your fantasy baseball league is not using Fantrax? Inconceivable! Check out everything Fantrax has to offer, and I’m sure you’ll come around to our way of thinking.


Review/Recap

Part 1 we covered:

  • the basics of formulas
  • cell references and how to lock a cell reference
  • some basic formulas such as:
    • sum
    • average
    • count
  • and how to download and import data into a sheet

Part 2 we covered

  • Tables
    • How cell references work
    • How to name a table
    • The ease of copying formulas within tables

Part 3 we covered

  • Ways to look up data
    • Vlookup
    • Index-match
    • Xlookup

Part 4 we covered

  • Conditional Formatting
  • Logic Formulas
  • Importing tricky ranks
    • Delineation
    • Text To Columns
    • Combining text
    • Cleaning up text

Today we’ll look into updating ranks briefly and then look at how to remove and clean up duplicates. This process got far more involved than I initially expected with some of the intricacies of the system so, as I will remind you later – please read the process/options before deciding which route you will. Take

Updates

First – Eric updated his top 500 yesterday – so of course we must update our copy of Eric’s top 500. So to start – let’s go to the article.

When a ranker changes their ranks it will vary in complexity and ease of updating the spreadsheet. For ranks like Eric’s, it’s very simple since you can just copy and paste the new data (starting in the first row with numbers and player names) and paste it in row 2. Our “rank 2” column will update from column 1and then the overall sheet will promptly update from that (if you do xlookup you won’t have a rank 2 column).

The advantage of this format with lookups is that you only need to update the Eric Cross page and every other table will sync because of the links. You should NEVER (well pretty much never) have to update data in multiple spots on any sheet – there should be formulas pulling the data together across the different pages.

To double-check you pulled the ranks correctly – go to your ranks sheet and we’ll look for Adbert Alzolay, who Eric astutely moved the amazing young pitcher up from 468 to 322. Go to the top to the ranks column – hit the down arrow – type in Adbert and make sure it says 322:

Dealing With Duplicates in Excel

Two is not better than one – just more confusing

Earlier in the series, we noted that there are duplicates in our player pool – multiple Wander Francos – etc. Our lookup formulas have no way to differentiate this and will look up the first one they find. This same problem happens with Will Smith – do you want to know where Eric, Chris, Dave, and others rank the reliever or the catcher?

This is a tricky thing to fix. Most rankers will only have some of these players – maybe both Will Smiths, most likely only one of the Wanders. Given this, you’ll have to decide how you want to build out your sheet to adjust for this.

One option is don’t do anything for most of them. This is not clean, nor would I advise it, but it’s certainly an option, though not a very good one.

A better is to figure out these duplicates and edit them.

Since the ranks sheet is the place from which all the others sheets pull from – that is where we will start.

So let’s go to the rank’s page and highlight the name column. We could simply use the “remove duplicates” tool under data – but that is a rough tool for something like this as it doesn’t help us know WHICH player that code will decide to delete. So, instead go to conditional formatting -> highlight cells rules -> and select highlight duplicates.

You can now scroll through and when you see red it means a duplicate and you can get rid of whichever one you decide you don’t care about having in your sheet. To help with this let’s add a team column next to the position column.

But wait – our positions are looked up from the league page with all the players in it… as would our team, so we actually should go to that page first.

But wait again… what about when we need to update our player pool and ownership and so we redownload the sheet – won’t anything we change be removed?

Correct!

So – let’s go to a team sheet (for me 30 rock) and right-click on the sheet to move or copy and add a copy at the end of all the sheets and name the new sheet Player Pool:

Next, follow the steps above highlight duplicates on the player pool page and now we can get rid of duplicate players here.

If we sort the player column by “Wander Franco” – we see the SF 3B and the TB phenom SS. No offense to the SF wander Franco, but you now have two options. (Please read through all of this before touching anything in your spreadsheet).

Perhaps the cleanest option is simply combining the player, team, and position so that you have all the players still but something more than a name. We can do this and will need to then also do it on all of the other ranking sheets. For the sheets we pulled in from Fantrax that is not a problem. For the other two sheets it is a little trickier because, as you may recall, our delineation and sorting resulted in deleting the teams and/or positions on some of those. If you decide to go this route, after we go through how to do it on the sheets that you can do it by yourself easily, you’ll need to do it for the Eddy’s and Dynasty Halp’s sheets. This will involve redownloading the data from last week and going through that process – but in a way to keep team and position. If you decide to go that route and need help, feel free to reach out. If I have time I may later post these sheets pre-sorted.

If you want to go with this route, then what we will do is on player pool rename player as player 1, add a column named Player after the team name column, and then make a column that combines the player, the position, and the team. As we discussed last week, we’ll use the & to combine the text and end up with a formula like this:

=[@[Player 1]]&” “&[@Team]&” “&[@Position]

Then on the Cross 500 page do the same thing, changing player to player 1, adding a new column named player, and doing the same sort of formula ( make sure team is first and then position).

One thing that is important to note – if you do this approach. If you are using xlookup the flip will be easy enough because it’s about column names on these tables and you can change those simply. If you used index match or vlookup there will be some other changes, so maybe this is not the approach you want to take to fix this. However – don’t necessarily take the path of least work when building a sheet like this upfront if it will save you headaches and work later.

So if you are finishing this – copy and paste VALUE only from the player page from the column with name, position, and team into the first column on the ranks page. Then you need to go through column by column and change the Player 1 in the formula to Player if you use xlookup or adjust as necessary for other lookups.

If you decide not to do this route and so you do not have to change the past coding or need to use the player positions and teams when you look up a player in the … then you will need to go through the duplicates and decide what to do.

To return to the Wander example – you can just simply delete the Wander Franco from SF row altogether, and then go through the rest of the table and when you see a highlighted duplicate – delete that row or the other less fantasy-relevant person sharing that name. If you do this and then copy that new list of names into the first column on the ranks sheet, you won’t have to change any old code. You may still need to, with this approach, do something like Will Smith C and Will Smith RP for those two players (or any other players where there are two relevant players with the same name), and then update them on any sheet you pull in to match that formatting. There are very few players who are duplicates who you likely want both of their names in your sheet since they are probably not ranked or rostered unless you play in a very very deep league. Case in point – there are no duplicate names on either Chris’ or Eric’s most recent lists.

When deciding which of these routes to take there are a few things I would urge you to consider:

  • Do you want every player possible listed? – option 1
  • Do you want to have to edit code we’ve already done and put in player team name and position for every player whenever you look them up? – option 1
  • Do you want to go through and manually delete all of the duplicate players? – option 2
  • Do you not want to do any of that work? – Go back to question one and say yes before getting to question 4.

Once you decide what you want to do then you can go ahead and follow either of the processes above. The conditional formatting of duplicates should still apply so if you run into any future duplicates you will know they are a duplicate.

We’ll stop here after this more involved article. Next week we’ll get back to some more simple and immediately fantasy impactful code, building team pages, and, either next week or the following, a shares page.


Fantrax logo

Fantrax was one of the fastest-growing fantasy sites of 2020, and we’re not stopping now. With multi-team trades, designated commissioner/league managers, and drag/drop easy click methods, Fantrax is sure to excite the serious fantasy sports fan – sign up now for a free year at Fantrax.com.

Leave A Reply

Your email address will not be published.