In honor of the upcoming 4th of July weekend, I’ve declared this week to be Pivot Table week. What do Pivot Tables have to do with Independence day? I’ll let you know when I find out.
In the meantime, are you ‘that guy’ who keeps clogging everyone’s email with 20 megabyte files. I want to share a simple trick to cut the size of your Pivot Table workbooks in half.
Each pivot table report you create from a separate data source will create its own pivot Cache that will increase your file size. The increase in file size depends on the size of the original data source that is being duplicated to create the pivot Cache.
Of course, the benefit you get from a pivot cache is optimization. Any changes you make to the pivot table report, such as rearranging fields, adding new fields, or hiding items, are made rapidly and with minimal overhead.
The down side of the pivot cache is that is basically doubles the size of your workbook. So every time you make a new pivot table from scratch, you essentially add to the file size of your workbook.
Note that pivot tables which share the same pivot cache will also share calculated fields, calculated items, and groupings.

You can delete your source data and your pivot table will function just fine. After deleting the source data, saving will shrink the file.
The only functionality you will lose is the ability to “Refresh” the pivot data, as the source data is not there. But DataPig, how do I get back to the raw data if I delete it?
Simple. Just double click the intersection of the row and column totals. This will create a new sheet with all the data currently contained inside the pivot cache.

On this 4th of July weekend, you should check out Debra’s site www.contextures.com. She’s Canadian, but she’s smart like an American.
RELATED STUFF


The link to PivotCacheFix.zip is broken.
Thanks Maurico. Corrected.
To reduce the file size I uncheck ‘Save data with table layout’ in PivotTable Options. If I want the underlying data to be available the next time I open the spreadsheet I click the ‘Refresh on open’ box in PivotTable Options.
Or you can reduce the size of your cache by orders of magnitude by modifying your ‘get data’ query so that it only actually gets the data you need for the specific task.
For instance, if you’re sucking every single line item from every single order for every single customer from a database to Excel, only to aggregate the data up to monthy totals across major product groups, then you’ve got MAJOR redundancy in your pivot cache.
You should filter and aggregate at the database end wherever possible. An exception might be when you’re examining data with fresh eyes, and you’re just pulling in a whole heap of different fields to “see what you can see”. This is what I’ve been doing at the place I’m currently contracted to, because I’m trying to find the best metrics and the best patterns from scratch. But even then you should go back later and filter out the ‘boring bits’ and just keep the stuff you actually need in the cache.
Great point Jeffrey.
I’m a bit late reading this – but love the tip for deleting the source sheet. didn’t realize we could do that. thanks!
I’ve got 6 Sheets in the Excel file and each of them has pivot table. The sourse of data for each of these pivot tables is the same table in MC Access. Here comes the qestion, how can I avoid duplication??
Thank You.
PS Sorry for my poor englis, I’m from Russia
Hi there
I was directed to your site thru the Excel G forum, i have a a workbook with two pivot tables – however I cannot find th PT1 ( i know it exists as the wizard for PT2 says Pivot table to in thenoptions dialogue box. I tried the Pivot Cache Fix but it did not work for me – is there anyother way of deleting a pivot table and its cache?
Andy: The PivotTable name in the Options dialog box is not a true index number. You very well could have only one pivot table named PivotTable2. In other words, the name field has nothing to do with the number of PivotTables that actually exist.
Is there a way to clear the pivot table cache ?
I would be interested in clear the pivot table cache before a pivot table refresh, that way, I would be assured that the data received would be in datasource order, especially if using ‘none’ as your pivot field sort order.
I am trying to avoid using the sort option, as it sometimes interferes with the grouping option (causes it to sort in correctly).
Any suggestions on clearing the pivot table cache?
Thanks in advance.
You may want to reconsider deleting the data tab after creating a pivot table in Excel 2003 if you have cells with more than 255 characters.
Excel will truncate the contents of the cached version to 255 characters. I don’t know if this is the case with newer versions of Excel.
They notice changes in the market trends and recreate their internet marketing tools as well as website content periodically. They put emphasis on the core areas of their business that bring the most benefits. A lot many customers come in to buy when they know that they get the best value for their money. Internet business marketing is also about establishing individual contacts with your prospects as well as existing customers.
Hi, thanks for the video tutorial.
Is there an add-in for excel 2007? If so, where can I download this from?
Hi Mike. Regarding your comment “The increase in file size depends on the size of the original data source that is being duplicated to create the pivot Cache.” do you know exactly how?
Recently I’ve created a couple of pivot tables and noticed that their file sizes are much LESS than a spreadsheet containing just the raw data. For instance, if I create 3 complete rows of =randbetween(0,10000) in excel 2007, then here’s the resulting file size from those 3.14 million cells:
raw data only 26.4 MB
Pivot table only 12.6 MB
Pivot table and data 39.1 MB
This really surprises me. Do you know why this might be?
Jeff: The difference is formatting. In a pivot cache, only the raw text needs to be stored. In a spreadsheet, you have formatting which can take up space.
Thanks That’s a heck of a lot of unwanted formatting. I was naughty and cross-posted a question about this over at http://www.dailydoseofexcel.com/archives/2004/11/26/creating-a-simple-pivot-table/#comment-48929 (not because you’re not the font of all excel knowledge, mind! Dick thought the same thing, although I’ve just commented that there’s still some stuff puzzling about all this to me…check this out:
On my home pc, using excel 2007 if I create 3 complete rows of =rand() then convert them to values, then:
1: when I save the workbook, I get a file size of 50 MB.
2. If I then create a pivot table on a second sheet from this data, but don’t explode it (i.e. don’t put anything in the data or column/row areas) I get 83.6 MB for pivot and data. So it seems the pivot alone adds 34 MB.
3. If I then explode the pivot table (i.e. put all three columns in the row area) so that the pivot resembles the original data source, I get 163 MB for pivot and data, which is quite some increase. I’d expect to actually see 50 MB (raw data) plus 34 MB (Pivot Cache) plus perhaps another 50 MB MAX for the extra resource required to store the cell formatting info for the range used by the exploded pivot. So the most I’d expect this to take up would be 134 MB, but it runs to 163 MB.
4. If I delete the raw data on Sheet 1, the file size drops 50 MB to 113 MB for the exploded pivot only, as you’d expect.
5. If I drag the three columns out of the pivot, then I get 52 MB for the unexploded pivot – which is almost the same file size as for the data alone. (Note I deleted all cells below and to the right of the pivot in order to try to make sure excel’s used range was limited to the pivot only. Not sure that this worked, given this high file size) So my pivot only seems to have grown from a 34 MB footprint when created to 52 MB just because I exploded it then contracted it.
6. If I reopen the file from (2), delete the data, and save the unexploded pivot, I get a file size of 34 MB…which was what I was expecting in step 5.
It seems very odd that if you expand a pivot table out, then contract it, the file size doesn’t decrease accordingly.
Maybe this means that if you’re playing around with a pivot that at some time you’ve exploded, then you might want to recreate your final product from scratch to cut down on some file size.
Excellent Article! I live and breathe pivot tables (PT) and did not understand why they made the files so big. Knowing there is a cache explains why you need to Refresh your PT when you change the source data.
You suggested copying and pasting a PT, rather than creating a new PT, in order to keep the size down. I assume using the Excel Move or Copy sheet option (displayed after right clicking on the sheet tab) does exactly the same.
Excel also provides a tip on saving space when creating more than one PT in the same file. If you start the PT wizard and select the same source data range as was used for another PT, Excel displays a message. It says you can save space by basing the new PT on the old PT data. You are offered a Yes or No option to do so.
Please bring your road show to New Jersey so I can thank you in person for your outstanding contributions to the average guy.
Will having multiple pivot tables point to the same pivot cache when connecting to an OLAP datasouce be a problem?
It is very helpful!! Thanks!
Never thought you could delete the data. I will have to try it out on a large (30mb) pivot table file i have.
the delete source data doesn’t work. If I delete the tab containing this data, the file still stays the same size because it’s linking back to this source.