Selection: Waffle#

A waffle() is a method for selecting cells that exist at the intersection of two selections.

It’s a concept that’s easiest to explain with examples.

Source Data#

The data source we’re using for these examples is shown below:

Note - this particular table has some very verbose headers we don’t care about, so we’ll be using bounded= to remove them from the previews as well as to show just the subset of data we’re working with.

The full data source can be downloaded here. We’ll be using th 6th tab named “Table2a”.

from tidychef import acquire, preview
from tidychef.direction import right, up
from tidychef.selection import XlsxSelectable

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/tidychef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 2a")
preview(table, bounded="A4:H10")

Table 2a

ABCDEFGH
4£ million
5Time periodPublic new housingPrivate new housingTotal new housingInfrastructure new workPublic other new workPrivate industrial new workPrivate commercial new work
6Dataset identifier codeMV3WMV3XMVL9MV3YMV3ZMV42MV43
7199721581758819690140156010872125517
8199817481776419448136276334888527655
9199915151598017436132477129916730963
10200019011785319693124306753818231200

Now lets image we have two selections as follows

codes = table.excel_ref("B5:H5").label_as("Codes")
years = table.excel_ref("A7:A10").label_as("Years")
preview(codes, years, bounded="A4:H10")
Codes
Years

Table 2a

ABCDEFGH
4£ million
5Time periodPublic new housingPrivate new housingTotal new housingInfrastructure new workPublic other new workPrivate industrial new workPrivate commercial new work
6Dataset identifier codeMV3WMV3XMVL9MV3YMV3ZMV42MV43
7199721581758819690140156010872125517
8199817481776419448136276334888527655
9199915151598017436132477129916730963
10200019011785319693124306753818231200

Now we have our column values we want to get observations - this is where waffle() comes in!

The syntax is as follows:

<selection1>.waffle(<direction>, <selection2>)

So in our example we’re starting with the “years” selection and looking right to create a waffle against the “codes” selection, as follows:

waffled_selection = years.waffle(right, codes).label_as("Values")
preview(waffled_selection, bounded="A4:H10")
Values

Table 2a

ABCDEFGH
4£ million
5Time periodPublic new housingPrivate new housingTotal new housingInfrastructure new workPublic other new workPrivate industrial new workPrivate commercial new work
6Dataset identifier codeMV3WMV3XMVL9MV3YMV3ZMV42MV43
7199721581758819690140156010872125517
8199817481776419448136276334888527655
9199915151598017436132477129916730963
10200019011785319693124306753818231200

Why the direction?#

Is the typical question at this point. It’s because you can always interpret a waffle along more than one axis.

As an example, lets use the same command but look to waffle up rather than right.

waffled_selection = years.waffle(up, codes).label_as("Values")
preview(waffled_selection, bounded="A4:H10")
Values

Table 2a

ABCDEFGH
4£ million
5Time periodPublic new housingPrivate new housingTotal new housingInfrastructure new workPublic other new workPrivate industrial new workPrivate commercial new work
6Dataset identifier codeMV3WMV3XMVL9MV3YMV3ZMV42MV43
7199721581758819690140156010872125517
8199817481776419448136276334888527655
9199915151598017436132477129916730963
10200019011785319693124306753818231200

You cannot waffle overlaps#

The waffle() method is very strict in that it will not allow you to waffle two selections that overlap on the relevant axis.

Again, we’ll show this with an example.

codes = table.excel_ref("B5:H5").label_as("Codes")
years = table.excel_ref("C7:C10").label_as("Years")
preview(codes, years, bounded="A4:H10")
waffled_selection = years.waffle(right, codes)
Codes
Years

Table 2a

ABCDEFGH
4£ million
5Time periodPublic new housingPrivate new housingTotal new housingInfrastructure new workPublic other new workPrivate industrial new workPrivate commercial new work
6Dataset identifier codeMV3WMV3XMVL9MV3YMV3ZMV42MV43
7199721581758819690140156010872125517
8199817481776419448136276334888527655
9199915151598017436132477129916730963
10200019011785319693124306753818231200

---------------------------------------------------------------------------
AmbiguousWaffleError                      Traceback (most recent call last)
Cell In[5], line 4
      2 years = table.excel_ref("C7:C10").label_as("Years")
      3 preview(codes, years, bounded="A4:H10")
----> 4 waffled_selection = years.waffle(right, codes)

File ~/.pyenv/versions/3.12.11/lib/python3.12/site-packages/tidychef/utils/decorators/dontmutate.py:24, in dontmutate.<locals>.wrapper(self, *args, **kwargs)
     21 @wraps(method)
     22 def wrapper(self, *args, **kwargs):
     23     self = copy.deepcopy(self)
---> 24     return method(self, *args, **kwargs)

File ~/.pyenv/versions/3.12.11/lib/python3.12/site-packages/tidychef/selection/selectable.py:537, in Selectable.waffle(self, direction, additional_selection)
    535     highest_x = dfc.maximum_x_offset(self.cells)
    536     if any([x for x in additional_selection if x.x <= highest_x]):
--> 537         raise AmbiguousWaffleError(
    538             "When using waffle right, your additional selections must all "
    539             "be right of your initial selections."
    540         )
    541 if direction.is_left:
    542     lowest_x = dfc.minimum_x_offset(self.cells)

AmbiguousWaffleError: When using waffle right, your additional selections must all be right of your initial selections.