AppleScript Tricks: Finding and formatting cells in Excel

johncwelch
New Contributor

To answer the obvious: I didn't use Numbers because it doesn't have the scripting support I need for the specific things I'm doing, and also, it's sometimes not good with really big spreadsheets.

So, like a lot of people, when I do archives to tape, I build a spreadsheet of the files in a given archive so that folks can easily find files they need restored. However, when you have say, gobs of folders and potentially hundreds of thousands of files, you want to do a bit of formatting to help folks out. Doing this manually would suck, so of course, I script it.

To create the initial list of files, I use ls -alhRFP on the main archive directory. This gives me a list of files by folder, with permission and basic kind data. I then open that in Excel. Now, what I want to do is to bold the name of each folder, and then change the cell background color, so the folder names stand out.

Doing this by hand is possible, but..no. Just no. That's dumb work. I could of course, use VBA, but VBA is a bit odd for me, I've not used it in yeaaaaaarrrrrs, and the debugging tools for VBA are also a bit odd. So I use AppleScript and Late Nite Software's AWESOME Script Debugger.

The script I built is below, and we'll go through it, because while Excel has excellent scripting support, it is a bit...odd.

set theSearchInfo to text returned of (display dialog "Enter the path to be used by Excel to format the spreadsheet
Include leading period" default answer "./Archive")

tell application "Microsoft Excel"

     tell active sheet
            tell used range
            set endOfRange to (count of rows) as text
        end tell
        --set theRange to "A1:" & endOfRange
        set theSearchRange to range ("A1:A" & endOfRange)
        set theHeaderRow to find theSearchRange what (theSearchInfo & "*:") look at whole search order rows search direction next
        set theTestCell to the first row index of theHeaderRow
        set font style of font object of theHeaderRow to "Bold"
        set color of interior object of theHeaderRow to {141, 180, 226}
        set theStart to first row index of theHeaderRow
        repeat with x from theStart to (endOfRange as integer)
            set theHeaderRow to find next theSearchRange after theHeaderRow
            if the first row index of theHeaderRow = theTestCell then
                exit repeat
            else
                set font style of font object of theHeaderRow to "Bold"
                set color of interior object of theHeaderRow to {141, 180, 226}
            end if
        end repeat
    end tell

end tell

The first part is pretty basic. I ask for the path Excel is going to try to find. I ask for the leading period because I do the ls commands not from /, but from root of the archive directory. The default answer provides an example of what I'm looking for.

Next we get into the Excel-specific command. This bit:

tell used range
    set endOfRange to (count of rows) as text
end tell

is one of the things I love about scripting apps like Excel. It knows its own used range in a given sheet. So I simply tell the used range object to give me the count of rows within itself, and that number is put in the endOfRange variable. Saves a lot of time trying to figure the used range out the hard way.

Next, we have to build a range value that Excel will use in the search. It's the format of: Starting ColumnLetterRowNumber:Ending ColumnLetterRowNumber. Since I've only one column, A, this is easy to set up. I create a range object that starts at A1 and ends at column A and whatever I have in the endOfRange variable. That range object goes in theSearchRange.

set theSearchRange to range ("A1:A" & endOfRange)

Now for the initial find, (I'll explain initial in a few):

set theHeaderRow to find theSearchRange what (theSearchInfo & "*:") look at whole search order rows search direction next

This is setting theHeaderRow to the result of looking in theSearchRange for what ends up being theSearchInfo with "*:" concatenated on the end. Excel's basic find feature is kind of rudimentary, so I'm using the initial part of the directory listing with a wildcard, and an ":" on the end to build my search string. Somewhat primitive in terms of searching, but in this case, it gets the job done well.

"look at whole" is telling find to look at the entire range. search order is specifying that the search order is row-based, not column-based, (so I'm searching down, not across), and that my search direction is looking for the next match, not the previous match.

(note: if you're debugging a large list, my initial list was 555K rows, you may want to arbitrarily set the search range to a smaller size, i.e. 2000 rows or less. Otherwise, Script Debugger will try to find pertinent data for EVERY row, and the script will time out, or take forever. )

once that's found, we take the value of theHeaderRow, and shove that into theTestCell, to be used later:

set theTestCell to the first row index of theHeaderRow

Next, we set the font formatting and the RGB background color of that first found cell:

set font style of font object of theHeaderRow to "Bold"
set color of interior object of theHeaderRow to {141, 180, 226}

Then we copy the same value we put into theTestCell into theStart:

set theStart to first row index of theHeaderRow

Remember when I said I'd explain "initial find"?

So Excel has an odd way of automating find. you have Find, Find Next, and Find Previous. The definitions are as follows:

Find: Finds specific information in a range, and returns a range object that represents the first cell where that information is found. Returns nothing if no match is found. Doesn't affect the selection or the active cell.

Find Next: Continues a search that was begun with the find method. Finds the next cell that matches those same conditions and returns a range object that represents that cell. Doesn't affect the selection or the active cell.

Find Previous: Continues a search that was begun with the find method. Finds the previous cell that matches those same conditions and returns a range object that represents that cell. Doesn't affect the selection or the active cell.

Excel has different functions for what would normally just be a part of the Find command. Okay, odd, but sure, we can work with that. First, we set up a repeat loop that goes from theStart to endOfRange, which means we start counting at the first cell the find command found:

repeat with x from theStart to (endOfRange as integer)

(I like to make sure my values are what I think they are, if there's a chance they might be something else.)

Next, we tell find next to do another find, starting at theHeaderRow, searching within theSearchRange. Note we don't have to re-specifiy the find command info. Find Next automatically uses that:

set theHeaderRow to find next theSearchRange after theHeaderRow

Just in case, we do a bit of bounds checking. The explicit range should help avoid problems here, but it doesn't hurt to be extra careful:

if the first row index of theHeaderRow = theTestCell then
    exit repeat

So if it turns out we've wrapped and are starting from the beginning again, this should catch that and kill the loop.

Every time Find next gets a hit, format that cell:

else
    set font style of font object of theHeaderRow to "Bold"
    set color of interior object of theHeaderRow to {141, 180, 226}
end if

and close up all the repeats and tells:

end repeat
    end tell

end tell

That's the entire script, and it runs through a sheet with around 100K rows in about 5 minutes. not blinding fast, but faster than I could ever do it manually, and with what ends up being much less chance for error.

Minor plug here: if you are going to do a lot of AppleScript, buy Script Debugger. It's not cheap, but it will save you SO much time by giving you a grown-up IDE for AppleScript, complete with proper debugging.

0 REPLIES 0