News:

Per request, I added a Forum Status page while revamping the AARoads back end.
- Alex

Main Menu

Microsoft Excel

Started by kphoger, July 15, 2020, 03:44:24 PM

Previous topic - Next topic

kkt

Excel is very useful.  I used it often at work too - mostly for projecting budgets out to the end of the fiscal year to see if we're in trouble.  It is way too enthusiastic about automatically reformatting numbers to scientific notation.  That's the scorpion napping in your boot.


abefroman329

Quote from: Rothman on October 31, 2022, 06:17:32 PMI don't understand not using functions with Excel.  They take tasks that take an hour or more and reduce the time down to minutes you can count on one hand.
I wouldn't be much of an Excel user if it wasn't for the data analytics bootcamp I completed a couple years ago, and even now, I will do some calculations manually if it's going to take too long to clean the data. 

Dirt Roads

Quote from: kphoger on October 31, 2022, 01:38:52 PM
New frustration!

I have the serial number column in Excel formatted to highlight duplicate values, and I had one pop up red the other day.  And I was all prepared to dispute the charges–about ready, in fact, to hit 'Send' on my e-mail to alert the MSO of my dispute.  But then I noticed something...

This particular model of modem uses an 18-digit all-numeric serial number.  Because Excel likes to convert such numbers into scientific notation and lop off significant digits, I have to format cells as text–which I do before pasting-as-values into the cell.  But apparently that's not quite how things work in Excel.

You see, even though the cells are supposedly formatted as text, and even though each cell correctly displays all 18 digits of the serial number, Excel nevertheless apparently doesn't bother to read all of the digits when checking for duplicate values.  As it turns out, the two serial numbers on my list–that Excel determined were duplicates–were only identical to the first 15 digits.

Specifically, they were these two numbers:
212930061077217513
212930061077217161

When the data came over from the MSO, they were each treated as numbers.  The cell displayed '2.1293E+17', and the formula bar showed a stored value of '212930061077217000' (doing away with the last three significant digits).  However, on my own running list files, they are formatted as text and show the full number string in both display and formula bar.

Just in case you haven't solved this issue yet, a simple way to manually add a single quote character " ' " to the start of each long number that is red flagged.  This will force the $number$ to be treated as a $string$.

Specifically, change the two numbers to forced strings:
'212930061077217513
'212930061077217161

If you don't want to manipulate the original data (as an auditor, I wasn't allowed to) you can create a separate column that uses a VisualBasic function to force the number to be convered to text.  In the older versions of VisualBasic (and Excel), there was a simple function STR$(variable) that would allow you to not manipulate the original data.  The new function is TEXT(variable,length) and it appears that setting length to zero will adapt to numbers of any length (someone might need to double check this).

Anywhoosit, the two cells that have long numbers are E2 and E48.  In the other column you would want to have the function  =TEXT($E$2,0)  and  =TEXT($E$48,0)  to force both of those numbers to text.  All of the other cells in that column can have =E1, =E3, =E4 etc. since those values are already text strings.

kphoger

Quote from: jemacedo9 on October 31, 2022, 05:25:46 PM
Instead, highlight the column, then do a Data : Text to Columns step.  Most times this is used to split data from one column to multiple columns.  But in this case, you're not going to split.  In the first step, select Fixed Width, then in the second step don't add any columm breaks, and in the third step,you can select Text instead of Number, and that actually converts the value to a text value within the same column.

I just tried this.  No difference, non-duplicate values are still highlighted as duplicates.

As I said earlier, my solution for turning them into text is to format an empty column as text, then paste-as-values into that column–not the other way around.

Quote from: Dirt Roads on October 31, 2022, 09:23:04 PM
Anywhoosit, the two cells that have long numbers are E2 and E48.  In the other column you would want to have the function  =TEXT($E$2,0)  and  =TEXT($E$48,0)  to force both of those numbers to text.  All of the other cells in that column can have =E1, =E3, =E4 etc. since those values are already text strings.

I just tried this.  I added a column and entered   =TEXT(E2,0)   in the cell.  (I tried it with dollar signs too.)  Still not useful, because it the value it returns still ends up lopping off the last three significant digits, turning them into zeroes instead.

Quote from: Dirt Roads on October 31, 2022, 09:23:04 PM
Just in case you haven't solved this issue yet, a simple way to manually add a single quote character " ' " to the start of each long number that is red flagged.  This will force the $number$ to be treated as a $string$.

Specifically, change the two numbers to forced strings:
'212930061077217513
'212930061077217161

Nope.  Not a solution.  They still flag as duplicate values.

He Is Already Here! Let's Go, Flamingo!
Dost thou understand the graveness of the circumstances?
Deut 23:13
Male pronouns, please.

Quote from: PKDIf you can control the meaning of words, you can control the people who must use them.

hotdogPi

While it won't be the actual serial number anymore, you could add a non-numeric character. Something like "A212930061077217513", "Serial 212930061077217513", etc., maybe even something outside ASCII that doesn't look like it changes the number but is enough for it to know it's not a number...
Clinched

Traveled, plus
US 13, 50
MA 10,22,35,40,53,63,79,107,109,126,138,141,151,159
NH 27,78,111A; CA 90; NY 9A,40,366,423; RI 117; CT 32,193,320; VT 2A,5A; PA 3,51,60, GA 42,140; FL A1A,7; WA 202; QC 162,165,263; 🇬🇧A100,A3211,A3213,A3215; 🇫🇷95 D316

New traveled: I-80, I-287, I-684, CT 72, GSP

kalvado

Quote from: 1 on November 01, 2022, 10:18:11 AM
While it won't be the actual serial number anymore, you could add a non-numeric character. Something like "A212930061077217513", "Serial 212930061077217513", etc., maybe even something outside ASCII that doesn't look like it changes the number but is enough for it to know it's not a number...
Then you have to remember to ALWAYS add that same thing to ALL numbers, and that is a bigger mess

kphoger

Quote from: kalvado on November 01, 2022, 11:43:39 AM

Quote from: 1 on November 01, 2022, 10:18:11 AM
While it won't be the actual serial number anymore, you could add a non-numeric character. Something like "A212930061077217513", "Serial 212930061077217513", etc., maybe even something outside ASCII that doesn't look like it changes the number but is enough for it to know it's not a number...

Then you have to remember to ALWAYS add that same thing to ALL numbers, and that is a bigger mess

And remember to remove it whenever I copy-and-paste from Excel into an e-mail...  which I do every time...

He Is Already Here! Let's Go, Flamingo!
Dost thou understand the graveness of the circumstances?
Deut 23:13
Male pronouns, please.

Quote from: PKDIf you can control the meaning of words, you can control the people who must use them.

abefroman329

Quote from: kphoger on November 01, 2022, 11:50:42 AM
Quote from: kalvado on November 01, 2022, 11:43:39 AM

Quote from: 1 on November 01, 2022, 10:18:11 AM
While it won't be the actual serial number anymore, you could add a non-numeric character. Something like "A212930061077217513", "Serial 212930061077217513", etc., maybe even something outside ASCII that doesn't look like it changes the number but is enough for it to know it's not a number...

Then you have to remember to ALWAYS add that same thing to ALL numbers, and that is a bigger mess

And remember to remove it whenever I copy-and-paste from Excel into an e-mail...  which I do every time...
You could use CONCAT to add a letter to the start of the string and the RIGHT function to remove it.

Dirt Roads

Quote from: Dirt Roads on October 31, 2022, 09:23:04 PM
Anywhoosit, the two cells that have long numbers are E2 and E48.  In the other column you would want to have the function  =TEXT($E$2,0)  and  =TEXT($E$48,0)  to force both of those numbers to text.  All of the other cells in that column can have =E1, =E3, =E4 etc. since those values are already text strings.

Quote from: kphoger on November 01, 2022, 10:10:27 AM
I just tried this.  I added a column and entered   =TEXT(E2,0)   in the cell.  (I tried it with dollar signs too.)  Still not useful, because it the value it returns still ends up lopping off the last three significant digits, turning them into zeroes instead.

It turns out that the second field in the =TEXT(x,format) function is a standard format code.  So the correct function is =TEXT(E2,"000000000000000000") with 18 zeroes (or more).  Tried it on my version of Excel and it still didn't work, which reveals the main problem:  both the E2 cell and the E8 cell have the number 212930061077217000 which reveals the loss of three significant digits. 

Quote from: Dirt Roads on October 31, 2022, 09:23:04 PM
Just in case you haven't solved this issue yet, a simple way to manually add a single quote character " ' " to the start of each long number that is red flagged.  This will force the $number$ to be treated as a $string$.

Specifically, change the two numbers to forced strings:
'212930061077217513
'212930061077217161

Quote from: kphoger on November 01, 2022, 10:10:27 AM
Nope.  Not a solution.  They still flag as duplicate values.

Indeed, if you manually add the single quote character " ' " at the beginning of each cell without retyping the entire number, Excel indeed continues to lose the last three significant digits and the cell reads as '212930061077217000.  The only way I was able to force Excel to accept the data was to manually enter '212930061077217513 into a cell and '212930061077217161 into another cell.  No loss of data and Excel correctly compares the two strings as non-identical.   Sorry for the runaround.

SSOWorld

Excel > Outlook > Visio > PowerPoint > Word >>>>>>>>>>>> everything else

Who needs to waste time writing a program when you can use a spreadsheet to write test steps - unless of course there's repetition when one can then turn to a node.js Yeoman generator with ejs templates.
Scott O.

Not all who wander are lost...
Ah, the open skies, wind at my back, warm sun on my... wait, where the hell am I?!
As a matter of fact, I do own the road.
Raise your what?

Wisconsin - out-multiplexing your state since 1918.

Scott5114

The problem I have with using complex spreadsheet formulas is much the same as I have with Parserfunctions in MediaWiki–the syntax is so dense it makes my eyes glaze over. Debugging a chain of nested IF()s in that tiny little box is a special kind of hell.  And as a result I never feel confident that what I wrote is actually right.

I can usually write a short Perl script in the same time it would take me to do anything other than simple calculations in a spreadsheet program, and be entirely confident that it's right because I can actually keep track of which test goes with what code.
uncontrollable freak sardine salad chef

kphoger

Counting parentheses is a special kind of irritation.

Also... when you're in the middle of writing an Excel formula... and need to check something on another spreadsheet... when you click on that other sheet, the cell reference pops into your formula.

He Is Already Here! Let's Go, Flamingo!
Dost thou understand the graveness of the circumstances?
Deut 23:13
Male pronouns, please.

Quote from: PKDIf you can control the meaning of words, you can control the people who must use them.

abefroman329

Quote from: Scott5114 on November 07, 2022, 10:23:05 PMDebugging a chain of nested IF()s in that tiny little box is a special kind of hell.

You can always copy and paste the formula into Notepad.  Adding a hard return between each IF statement helps too.  But there's no getting around the fact that, sooner or later, you're gonna be banging your head against the desk, trying to figure out why the function won't work, when it won't work because you forgot a comma.

kalvado

Quote from: abefroman329 on November 08, 2022, 10:25:49 AM
Quote from: Scott5114 on November 07, 2022, 10:23:05 PMDebugging a chain of nested IF()s in that tiny little box is a special kind of hell.

You can always copy and paste the formula into Notepad.  Adding a hard return between each IF statement helps too.  But there's no getting around the fact that, sooner or later, you're gonna be banging your head against the desk, trying to figure out why the function won't work, when it won't work because you forgot a comma.
As if something like that never happened in (insert your favorite language)

hotdogPi

Quote from: kalvado on November 08, 2022, 10:43:15 AM
Quote from: abefroman329 on November 08, 2022, 10:25:49 AM
Quote from: Scott5114 on November 07, 2022, 10:23:05 PMDebugging a chain of nested IF()s in that tiny little box is a special kind of hell.

You can always copy and paste the formula into Notepad.  Adding a hard return between each IF statement helps too.  But there's no getting around the fact that, sooner or later, you're gonna be banging your head against the desk, trying to figure out why the function won't work, when it won't work because you forgot a comma.
As if something like that never happened in (insert your favorite language)
Python doesn't use nested parentheses for if statements; it uses whitespace and linebreaks. And if a comma is missed (which typically occurs for data that spans multiple lines, not for code), it will immediately tell you as a syntax error when trying to run it.
Clinched

Traveled, plus
US 13, 50
MA 10,22,35,40,53,63,79,107,109,126,138,141,151,159
NH 27,78,111A; CA 90; NY 9A,40,366,423; RI 117; CT 32,193,320; VT 2A,5A; PA 3,51,60, GA 42,140; FL A1A,7; WA 202; QC 162,165,263; 🇬🇧A100,A3211,A3213,A3215; 🇫🇷95 D316

New traveled: I-80, I-287, I-684, CT 72, GSP

Jim

Tradeoffs that the programming languages research community has argued about for decades...
Photos I post are my own unless otherwise noted.
Signs: https://www.teresco.org/pics/signs/
Travel Mapping: https://travelmapping.net/user/?u=terescoj
Counties: http://www.mob-rule.com/user/terescoj
Twitter @JimTeresco (roads, travel, skiing, weather, sports)

kphoger

Bear in mind that I've never used any of those other whatchamagoozits y'all are talking about, but...

There's something I like about how Excel does it.  There's one particular report I run every pay-period, and it includes conditional formulae and conditional formatting.

For example, =IF(OR(Y1221="YES",AND(K1221=K1220,X1221=X1220)),"REMOVE","PAY")

I add to this spreadsheet every day until payroll comes around.  While I'm adding data, I leave that column blank.  Then, when it's time to run payroll, I sort my data and then fill in that column.  After that, it's pivot table and VLOOKUP time.

Anyway, I start a new spreadsheet every pay-period.  It's very convenient to simply copy from the formula bar on Cell Z3 in the old sheet, then paste into the formula bar on Cell Z3 in the new sheet.  No need to figure out the syntax every time.

He Is Already Here! Let's Go, Flamingo!
Dost thou understand the graveness of the circumstances?
Deut 23:13
Male pronouns, please.

Quote from: PKDIf you can control the meaning of words, you can control the people who must use them.

Scott5114

Quote from: kphoger on November 08, 2022, 01:32:16 PM
=IF(OR(Y1221="YES",AND(K1221=K1220,X1221=X1220)),"REMOVE","PAY")

The equivalent Perl would be:

if ( ($k1221 eq $k1220 and $x1221 eq $x1220) or ($y1221 eq "YES") )
{
    print "REMOVE";
}
else
{
    print "PAY";
}


There's still a little bit of parenthesis nastiness, but I think that's going to happen no matter what just because of the nature of the comparison you're making.

"eq" does a string comparison (i.e. it compares non-numerical data). If you were comparing numbers, you'd use "==" instead. (Most newer languages just have "==" handle both types of comparison.)

I find it a lot easier to understand what Perl is doing there. But of course the tricky part isn't getting Perl to do the comparison, it's getting the data into Perl to get to where you can do the comparison.

I feel like the best of both worlds would be a spreadsheet program that let you run code in a small programming language in each cell, rather than using Excel-style formulas. But then again most programmers would rather just write a script than faff about with a spreadsheet at all, so that's a cool idea that will probably never happen.
uncontrollable freak sardine salad chef

US 89

According to Excel, the time between 12:42:02pm and 12:42:06pm yesterday is equal to 12:00:04 AM on January 0th, 1900.

Took like 30 minutes of texting people and hunting around the various options to figure how to resolve that one. Apparently you have to change the format to "General" (not an easy setting to find) and then multiply by 86400 because it gives you a number in hours as default.

Dirt Roads

There's been a lot of discussion about issues resolving parentheses in MS Excel, but one of the features of the =IF function is that each parenthesis gets color-coded every time you type one.  I have to watch carefully, because I tend to "type" in the "function bar" located at the top of the spreadsheet.  The real function is being entered in the cell itself.  So if you get confused with how many open and closed parentheses are inside of your formula, simply type a closed parenthesis and it will come up bold and in the same color as the open one.  When you finish correctly, the first open and the last closed parentheses will come up in bold black (and then you know that you are done).

=IF(A3>B3,IF(A3>C3,IF(A3>D3,IF(A3>E3,"Giant"),"Huge"),"Big"),"Small")

Don't get confused with the colors, because the cell tags "A3", "B3", "C3", "D3", "E3" are also color-coded with the same colors, and any correlation with the parenthesis color is a coincidence.

By the way, I tend to fill text cells with multiple lines of text where each one is separated by an old-fashioned carriage return.  Depending on which spreadsheet that I'm working with, I've got my function bar expanded to be able to view two or three lines of text at a time.  It makes editing large amounts of data and text much more like working in a miniature text editor, but it distracts me away from some of the nifty stuff that goes on in the cell itself when you are entering a function.

kalvado

Quote from: Dirt Roads on November 11, 2022, 09:42:27 AM
There's been a lot of discussion about issues resolving parentheses in MS Excel, but one of the features of the =IF function is that each parenthesis gets color-coded every time you type one.  I have to watch carefully, because I tend to "type" in the "function bar" located at the top of the spreadsheet.  The real function is being entered in the cell itself.  So if you get confused with how many open and closed parentheses are inside of your formula, simply type a closed parenthesis and it will come up bold and in the same color as the open one.  When you finish correctly, the first open and the last closed parentheses will come up in bold black (and then you know that you are done).

=IF(A3>B3,IF(A3>C3,IF(A3>D3,IF(A3>E3,"Giant"),"Huge"),"Big"),"Small")

Don't get confused with the colors, because the cell tags "A3", "B3", "C3", "D3", "E3" are also color-coded with the same colors, and any correlation with the parenthesis color is a coincidence.

By the way, I tend to fill text cells with multiple lines of text where each one is separated by an old-fashioned carriage return.  Depending on which spreadsheet that I'm working with, I've got my function bar expanded to be able to view two or three lines of text at a time.  It makes editing large amounts of data and text much more like working in a miniature text editor, but it distracts me away from some of the nifty stuff that goes on in the cell itself when you are entering a function.
Yes, you can make things more readable. At some point, though, it is easy to get lost in the code.
That is true for excel, that is true for any other language I dealt with. There are ways to avoid that. There are users and programmers who neglect those approaches. There are companies which neglect that (I am thinking a major car maker as a great example of such mess).
There are many syntax highlight tools, syntax analysis tools. There are still people who get lost.
Welcome to computer world...   

kphoger

Quote from: US 89 on November 11, 2022, 12:42:52 AM
According to Excel, the time between 12:42:02pm and 12:42:06pm yesterday is equal to 12:00:04 AM on January 0th, 1900.

Took like 30 minutes of texting people and hunting around the various options to figure how to resolve that one. Apparently you have to change the format to "General" (not an easy setting to find) and then multiply by 86400 because it gives you a number in hours as default.

Huh.  I find the difference between dates in Excel all the time at work (multiple times every day), and all I have to do is format the cells to "Number" with zero decimal places;  then I know how many days off they are from each other.  Every so often, I also track e-mail lag in Excel down to the minute, which means I need to multiply by 1440.  I've never needed accuracy down to the second, so I never thought about what that factor would be.

He Is Already Here! Let's Go, Flamingo!
Dost thou understand the graveness of the circumstances?
Deut 23:13
Male pronouns, please.

Quote from: PKDIf you can control the meaning of words, you can control the people who must use them.

CtrlAltDel

Quote from: kphoger on November 11, 2022, 11:02:43 AM
Quote from: US 89 on November 11, 2022, 12:42:52 AM
According to Excel, the time between 12:42:02pm and 12:42:06pm yesterday is equal to 12:00:04 AM on January 0th, 1900.

Took like 30 minutes of texting people and hunting around the various options to figure how to resolve that one. Apparently you have to change the format to "General" (not an easy setting to find) and then multiply by 86400 because it gives you a number in hours as default.

Huh.  I find the difference between dates in Excel all the time at work (multiple times every day), and all I have to do is format the cells to "Number" with zero decimal places;  then I know how many days off they are from each other.  Every so often, I also track e-mail lag in Excel down to the minute, which means I need to multiply by 1440.  I've never needed accuracy down to the second, so I never thought about what that factor would be.

86,400. I forget why I needed to know that, but it's one of those things burned into my head. Also, that there are 63,360 inches in a mile (it's 360 backwards concatenated with 360). And of course, there are 525,600 minutes in a nonleap year.
I-290   I-294   I-55   (I-74)   (I-72)   I-40   I-30   US-59   US-190   TX-30   TX-6

CtrlAltDel

Quote from: US 89 on November 11, 2022, 12:42:52 AM
According to Excel, the time between 12:42:02pm and 12:42:06pm yesterday is equal to 12:00:04 AM on January 0th, 1900.

Took like 30 minutes of texting people and hunting around the various options to figure how to resolve that one. Apparently you have to change the format to "General" (not an easy setting to find) and then multiply by 86400 because it gives you a number in hours as default.

I'm not sure what you're trying to say here. In Excel, times are stored internally as fractions of a day. So, 12:42:02 pm is 0.529189814814815 and 12:42:06 pm is 0.529236111111111, while 12:00:04 am is 0.000046296296296. Now, there is some error involved here, since the fractions will only rarely be exact, but it won't be off by 12 hours.
I-290   I-294   I-55   (I-74)   (I-72)   I-40   I-30   US-59   US-190   TX-30   TX-6

hotdogPi

Quote from: CtrlAltDel on November 11, 2022, 11:44:11 AM
Quote from: US 89 on November 11, 2022, 12:42:52 AM
According to Excel, the time between 12:42:02pm and 12:42:06pm yesterday is equal to 12:00:04 AM on January 0th, 1900.

Took like 30 minutes of texting people and hunting around the various options to figure how to resolve that one. Apparently you have to change the format to "General" (not an easy setting to find) and then multiply by 86400 because it gives you a number in hours as default.

I'm not sure what you're trying to say here. In Excel, times are stored internally as fractions of a day. So, 12:42:02 pm is 0.529189814814815 and 12:42:06 pm is 0.529236111111111, while 12:00:04 am is 0.000046296296296. Now, there is some error involved here, since the fractions will only rarely be exact, but it won't be off by 12 hours.

The "time between 12:42:02pm and 12:42:06pm" is 4 seconds, which corresponds to 4 seconds after midnight as if it can't tell a time from a duration.
Clinched

Traveled, plus
US 13, 50
MA 10,22,35,40,53,63,79,107,109,126,138,141,151,159
NH 27,78,111A; CA 90; NY 9A,40,366,423; RI 117; CT 32,193,320; VT 2A,5A; PA 3,51,60, GA 42,140; FL A1A,7; WA 202; QC 162,165,263; 🇬🇧A100,A3211,A3213,A3215; 🇫🇷95 D316

New traveled: I-80, I-287, I-684, CT 72, GSP