The reason can be inferred from the difference between:
- NUL: No character. It is used for filling in time or filling space on the surface (such as surface of platter) of storage device where there are no data. We’ll use this character when we’ll be doing programming for data wipers (destructive and non-destructive both) to wipeout the unallocated space so that deleted data may not be recovered by any one or by any program.
- SOH: This character is used to indicate the start of heading, which may contain address or routing information.
- STX: This character is used to indicate the start of text and in this way this is also used to indicate the end of the heading.
Source of Definition
Various comments have already pointed out that these characters (the first 32 ASCII codes, 0 to 31) are called control characters because they perform various printer and display control operations rather than displaying symbols.
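As a quick check of the codes involved, here is a minimal sketch in VBA. The names `IsControlCode` and `ShowControlCodes` are made up for illustration; only `Chr` and `Asc` are built-in functions.

```vba
Option Explicit

' Hypothetical helper: True when a character code is in the ASCII control range (0 to 31).
Function IsControlCode(ByVal code As Long) As Boolean
    IsControlCode = (code >= 0 And code <= 31)
End Function

Sub ShowControlCodes()
    ' Print the character code of each character discussed above.
    Debug.Print "NUL code:", Asc(Chr(0))
    Debug.Print "SOH code:", Asc(Chr(1))
    Debug.Print "STX code:", Asc(Chr(2))
    Debug.Print "Asterisk code:", Asc("*")

    ' SOH is a control character, the asterisk is a printable one.
    Debug.Print "SOH is control:", IsControlCode(Asc(Chr(1)))
    Debug.Print "Asterisk is control:", IsControlCode(Asc("*"))
End Sub
```

Run `ShowControlCodes` from the VBA editor and read the results in the Immediate window (Ctrl+G).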
In Hadoop, SOH is often used as a field separator, especially while reading CSV files. Also, upon opening an Excel file as binary in Notepad++ (that is, the plain .xlsx, without unzipping it) and showing the control characters, one will see a lot of SOH characters. The actual data is not readable in this view because an .xlsx file is a compressed ZIP archive.
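To illustrate the field-separator use of SOH mentioned above, here is a small sketch that splits one SOH-delimited record with the built-in `Split` function. The record contents and the names `SplitSohRecord` and `DemoSohRecord` are invented for the example.

```vba
Option Explicit

' Hypothetical helper: splits one SOH-delimited record into its fields.
Function SplitSohRecord(ByVal record As String) As Variant
    SplitSohRecord = Split(record, Chr(1))
End Function

Sub DemoSohRecord()
    Dim record As String
    Dim fields As Variant
    Dim i As Long

    ' Made-up record: three fields joined by SOH (Chr(1))
    record = "42" & Chr(1) & "Alice" & Chr(1) & "London"

    fields = SplitSohRecord(record)
    For i = LBound(fields) To UBound(fields)
        Debug.Print i, fields(i)
    Next i
End Sub
```

Because SOH almost never occurs in ordinary text, it can serve as a delimiter without the quoting rules that commas require.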
Chr(1) = SOH is a pointer to the memory address of all the cells containing TEXT
and
Chr(0) = NUL is a pointer to the memory address of all the cells containing NULL, i.e. BLANK CELLS
Let’s examine the above statements in light of the different commands shown in the image below. The top-left range is the original data, and the example shows how replacement works for SOH, NUL and asterisk (*) using the Range.Replace method.
So, why are Chr(0) and Chr(1) able to replace the values in an Excel cell if they are merely address pointers, and how does Chr(1) act differently from Chr(42)?
This happens because of the structure of an Excel worksheet. When you open a brand new Excel workbook, the cells are not marked. The sheet is completely blank, both visually and in the memory of the system.
However, as soon as one enters something in cell B2, all the blank cells in the square up to B2 (A1:B2) will be assigned a Chr(0) pointer, and cell B2 will be assigned a Chr(1) pointer.
Now, if one deletes the content of cell B2, then B2, along with the other cells, will be assigned a NUL, or Chr(0), pointer.
This is also the reason why merely clearing the contents using Cells.ClearContents won’t clear the cells in memory, and why an Excel workbook bloats when a far distant cell was used but not deleted using Cells.Clear or Ctrl+Delete.
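One way to observe this yourself is to watch how `UsedRange` changes after each kind of clearing. The sketch below assumes it is acceptable to add a scratch worksheet to the active workbook; the name `InspectUsedRange` and the cell addresses are arbitrary, and the addresses printed may differ between Excel versions, so none are stated here.

```vba
Option Explicit

Sub InspectUsedRange()
    Dim ws As Worksheet

    ' Assumption: adding a scratch sheet to the active workbook is acceptable.
    Set ws = ActiveWorkbook.Worksheets.Add

    ws.Range("B2").Value = "text"
    ws.Range("Z1000").Value = "far away"   ' a far distant cell
    Debug.Print "After writing:       "; ws.UsedRange.Address

    ' Removes values only; formats stay in place
    ws.Cells.ClearContents
    Debug.Print "After ClearContents: "; ws.UsedRange.Address

    ' Removes values and formats
    ws.Cells.Clear
    Debug.Print "After Clear:         "; ws.UsedRange.Address
End Sub
```

Comparing the three addresses in the Immediate window shows which operation actually shrinks the range Excel considers in use.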
Now that it is clear why Excel replaces the contents or clears the worksheet with the given command, it is also important to understand the difference between asterisk (*) and SOH [or Chr(1)].
Asterisk (*) is a wildcard: it matches, and therefore replaces, cell contents of any length, with a minimum length of 0.
Chr(1), on the other hand, identifies the cells with SOH, i.e. the cells containing a value, and then replaces the value at the destination of the pointer.
The difference is not in the result but in how the cells to be replaced are found or identified.
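The comparison can be reproduced on a small range with a sketch like the one below. It assumes a scratch worksheet may be added to the active workbook; `CompareReplaceTargets` is a made-up name. The `~*` case is included only to contrast the wildcard with a literal asterisk, which Excel's search syntax escapes with a tilde.

```vba
Option Explicit

Sub CompareReplaceTargets()
    Dim ws As Worksheet

    ' Assumption: adding a scratch sheet to the active workbook is acceptable.
    Set ws = ActiveWorkbook.Worksheets.Add

    ' Case 1: asterisk as a wildcard
    ws.Range("A1:B2").Value = "A"
    ws.Range("A1:B2").Replace What:="*", Replacement:="Z", LookAt:=xlPart
    Debug.Print "Wildcard *: "; ws.Range("A1").Value

    ' Case 2: literal asterisk, escaped with a tilde
    ws.Range("A1:B2").Value = "A*B"
    ws.Range("A1:B2").Replace What:="~*", Replacement:="-", LookAt:=xlPart
    Debug.Print "Literal ~*: "; ws.Range("A1").Value

    ' Case 3: SOH as the search string
    ws.Range("A1:B2").Value = "A"
    ws.Range("A1:B2").Replace What:=Chr(1), Replacement:="X", LookAt:=xlPart
    Debug.Print "SOH Chr(1): "; ws.Range("A1").Value
End Sub
```

Each case resets the range first, so the three results in the Immediate window can be compared directly.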
I created a small macro to validate the hypothesis, in which I record the time it takes to replace about half a million cells (A1 to CZ5000, i.e. 104 columns × 5,000 rows = 520,000 cells) using Chr(1) = SOH and Chr(42) = *.
It turns out that using the pointer to the address of cells with TEXT, i.e. SOH, takes less time than the asterisk (*).
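Sub Replace_Timer()
    ' Times Range.Replace with SOH (Chr(1)) versus asterisk (Chr(42)) over 25 runs.
    ' Results go to Sheet4, columns B (SOH) and C (asterisk), starting at row 2.
    Dim startTime As Double, endTime As Double
    Dim i As Long

    Sheet3.Activate
    Sheet3.Range("A1:CZ5000").Value = "A"

    For i = 1 To 25
        ' Replace using SOH as the search string
        startTime = Timer
        Sheet3.Cells.Replace Chr(1), "X", xlPart, xlByRows, False
        endTime = Timer
        Sheet4.Cells(i + 1, 2).Value = endTime - startTime 'Execution time in seconds (Timer returns seconds, not milliseconds)

        ' Replace using the asterisk wildcard as the search string
        startTime = Timer
        Sheet3.Cells.Replace Chr(42), "Z", xlPart, xlByRows, False
        endTime = Timer
        Sheet4.Cells(i + 1, 3).Value = endTime - startTime 'Execution time in seconds
    Next i

    Sheet4.Activate
End Sub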
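A possible variation, offered only as a sketch and not as the macro used for the measurements above: reset the range to the same starting value before every timed call, so both search strings always run against identical data, and print the averages instead of writing each run to a sheet. The names `TimeReplace` and `CompareAverages` are hypothetical, and the code overwrites A1:CZ5000 on the active sheet.

```vba
Option Explicit

' Hypothetical helper: times one Replace call on freshly reset data, in seconds.
Function TimeReplace(ByVal target As Range, ByVal pattern As String) As Double
    Dim t0 As Double

    target.Value = "A"          ' same starting data for every run
    t0 = Timer                  ' Timer returns seconds since midnight
    target.Replace What:=pattern, Replacement:="X", LookAt:=xlPart, _
                   SearchOrder:=xlByRows, MatchCase:=False
    TimeReplace = Timer - t0
End Function

Sub CompareAverages()
    Const RUNS As Long = 25
    Dim rng As Range
    Dim i As Long
    Dim totalSoh As Double, totalStar As Double

    ' Warning: this overwrites A1:CZ5000 on the active sheet.
    Set rng = ActiveSheet.Range("A1:CZ5000")

    Application.ScreenUpdating = False
    For i = 1 To RUNS
        totalSoh = totalSoh + TimeReplace(rng, Chr(1))
        totalStar = totalStar + TimeReplace(rng, "*")
    Next i
    Application.ScreenUpdating = True

    Debug.Print "SOH average (s):", totalSoh / RUNS
    Debug.Print "Asterisk average (s):", totalStar / RUNS
End Sub
```

Resetting inside the helper keeps the setup cost out of the timed section, at the price of a longer total run.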