This portion of the site is dedicated to archiving definitions of interesting, misunderstood, or otherwise noteworthy terms found mostly in infosec, science, philosophy, and IT.
Many of these are in my own words, but those that aren’t are common, dictionary denotations that can be found anywhere. Despite their commonality, I seem to forget many of these often — hence the need for this page.
At the moment, many of the entries here belong in their own section under danielmiessler.com/study. I’ll try to remedy this as soon as possible.
Uuencode
A file is “uuencoded” when it is converted into 7-bit ASCII so that it can be shared with other systems (usually via email). Uuencode originally stood for “Unix to Unix encoding”.
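As a quick illustration, here is a minimal sketch using Python's standard `binascii` module, which encodes one uuencoded line at a time. The sample bytes are just an example I picked; a full uuencoded file also carries "begin" and "end" lines that this sketch leaves out.

```python
import binascii

# Three raw bytes to encode (sample data chosen for illustration).
raw = b"Cat"

# b2a_uu encodes a single line of up to 45 bytes.
encoded = binascii.b2a_uu(raw)
print(encoded)  # b'#0V%T\n'

# Every byte of the result fits in 7 bits, which is the whole point.
print(all(byte < 128 for byte in encoded))  # True

# Decoding restores the original bytes.
decoded = binascii.a2b_uu(encoded)
print(decoded)  # b'Cat'
```

The first character of the encoded line ("#") carries the length of the data; the rest carry the data itself, six bits per character.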
Diffie-Hellman
A key agreement/exchange protocol published by Diffie and Hellman in 1976. The protocol is designed to allow two parties to establish a shared secret over a public medium. Each host takes a large prime number and a base number (both public), picks a private number, and combines the three to generate a public number. The hosts exchange public numbers, and each one combines the other’s public number with its own private number to arrive at the same shared number, which only the two hosts know. Diffie-Hellman by itself, however, is vulnerable to man-in-the-middle attacks: an attacker who intercepts the initial exchange of public numbers can perform a separate key exchange with each endpoint and then simply forward communications between them after reading and/or recording their contents. This can be defeated by authenticating the exchange, for example with digital certificates.
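The arithmetic behind the exchange can be sketched in a few lines of Python. This is a toy under stated assumptions: the prime `p = 23` and base `g = 5` are tiny values chosen so the numbers stay readable, and the names `alice` and `bob` are placeholders. Real deployments use primes that are thousands of bits long.

```python
import secrets

# Public parameters, known to everyone (toy values, far too small for real use).
p = 23  # prime modulus
g = 5   # base

def make_keypair(p, g):
    """Pick a random private number and derive the matching public number."""
    private = secrets.randbelow(p - 3) + 2  # random value in [2, p - 2]
    public = pow(g, private, p)             # g ** private mod p
    return private, public

alice_private, alice_public = make_keypair(p, g)
bob_private, bob_public = make_keypair(p, g)

# Only the public numbers cross the wire. Each side combines the other's
# public number with its own private number.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

print(alice_shared == bob_shared)  # True
```

Nothing in this exchange proves who sent a given public number, which is exactly the gap a man in the middle exploits.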
ASCII
ASCII is a 7-bit character code standardized in the US; the 1967 revision fixed the code in essentially the form still used today. Web protocols lean on it heavily: HTTP headers and HTML markup are built from ASCII characters. ASCII was not designed to represent non-English alphabets, and this is a major shortcoming.
There are 8-bit character codes in common use that are identical with ASCII in the first 128 positions, but these are not ASCII. Some common codes are listed below under ISO 8859.
Unicode
The Unicode Standard is the universal character encoding standard used for representation of text for computer processing. It was originally intended to be a 16-bit character set, but it is now seen that 65,536 characters are insufficient. As a result, different implementations use different character sizes for ‘native’ Unicode representation. For example, Windows uses 16-bit characters, while Linux typically uses 32-bit characters. Nevertheless, there are well-defined standards which permit the orderly interchange of Unicode data.
It should be noted that most modern software systems (e.g., Windows, Java) use Unicode as their exclusive internal text representation.
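To make the character-size point concrete, here is a small Python sketch that encodes a few characters with the three common Unicode encoding forms (UTF-8, UTF-16, and UTF-32). The helper name `encoded_sizes` and the sample characters are my own choices for illustration.

```python
def encoded_sizes(ch):
    """Return the byte length of one character in UTF-8, UTF-16, and UTF-32."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),  # "-le" avoids counting a byte order mark
        len(ch.encode("utf-32-le")),
    )

# A, e with acute accent, the euro sign, and an emoji outside the 16-bit range.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    utf8, utf16, utf32 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8} UTF-16={utf16} UTF-32={utf32} bytes")
```

The last sample needs four bytes even in UTF-16, which is what "65,536 characters are insufficient" looks like in practice.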
MIME
Multipurpose Internet Mail Extensions. MIME is an Internet standard that specifies how messages must be formatted so that they can be exchanged between different mail systems. MIME allows you to include just about any kind of file in an email message. Some examples include text, images, audio, video, and character sets other than ASCII. S/MIME is an extension of MIME that adds encryption and digital signatures to email messages.
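Here is a rough sketch of what that looks like with Python's standard `email` package. The addresses, subject, and file name are placeholders I made up; the point is only that attaching a binary file turns the message into a multi-part MIME structure.

```python
from email.message import EmailMessage

# Build a plain text message (addresses are placeholders).
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Report attached"
msg.set_content("The report is attached.")

# Attach arbitrary binary data; the library picks a 7-bit safe encoding for it.
payload = b"\x00\x01\x02 binary data"
msg.add_attachment(
    payload,
    maintype="application",
    subtype="octet-stream",
    filename="report.bin",
)

# The message is now a container holding a text part and a binary part.
content_types = [part.get_content_type() for part in msg.walk()]
print(content_types)

attachment = next(msg.iter_attachments())
print(attachment["Content-Transfer-Encoding"])
```

Printing `msg.as_string()` shows the boundary lines and per-part headers that mail systems actually exchange.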
ISO 8859
ISO 8859 is a full series of standardized multilingual single-byte (8-bit) coded graphic character sets for writing in alphabetic languages. ISO 8859-1 is commonly used in the United States and covers the Western European alphabets. The first ten parts of the series are:
- ISO 8859-1: Latin1 (West European)
- ISO 8859-2: Latin2 (East European)
- ISO 8859-3: Latin3 (South European)
- ISO 8859-4: Latin4 (North European)
- ISO 8859-5: Cyrillic
- ISO 8859-6: Arabic
- ISO 8859-7: Greek
- ISO 8859-8: Hebrew
- ISO 8859-9: Latin5 (Turkish)
- ISO 8859-10: Latin6 (Nordic)
Hanlon’s Razor
“Never attribute to malice that which can be adequately explained by stupidity.”
Hanlon, as used in the name, is thought to be a corruption of “Heinlein”, since Robert Heinlein used a similar phrase in his 1941 story “Logic of Empire”.
Finagle’s Law
This proverb is also referred to as “Finagle’s Law of Dynamic Negatives”, and is simply a “folk” version of Murphy’s Law.
“Anything that can go wrong, will.”
Murphy’s Law
“If there are two or more ways to do something, and one of those ways can result in a catastrophe, then someone will do it.”
Murphy’s law is a principle of defensive design coined by Edward A. Murphy, Jr., an engineer working on rocket-sleds for the Air Force in 1949. The saying quickly spread and mutated. Currently, the most popular form of the saying is Finagle’s Law, the “folk” version of the original proverb.
Sturgeon’s Law
“Ninety percent of everything is crap.”
This saying is derived from a quote by Theodore Sturgeon, who said:
“Sure, 90% of science fiction is crud. That’s because 90% of everything is crud.”
Ninety-Ninety Rule
“The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time.”
- a quote by Tom Cargill of Bell Labs.
Occam’s Razor
A simple concept attributed to the mediaeval philosopher William of Occam which states that one should not make any more assumptions than the minimum needed.
“One should not increase, beyond what is necessary, the number of entities required to explain anything.”
Also called the “Principle of Parsimony” and the “Principle of Simplicity”.
Cross Site Scripting (XSS)
Cross Site Scripting is a type of vulnerability that typically works when a user clicks a link containing malicious code, and is most often used to attempt to hijack sessions. It affects web sites that use dynamically generated pages: XSS occurs when a web server embeds browser input into the output sent back to a browser without neutralizing it, allowing malicious script to be executed.
Basically, using this technique, an attacker gives script as input to a site, which is then used to generate a page for the victim, and that code then runs on the victim’s machine. This is done by passing special characters to the server, which then uses them in the creation of the dynamic content for its output. The nature of the attack (and the reason for the name) is that the code being executed on the victim’s machine is trusted to some degree because it appears to come from a trusted site they are visiting, not the attacker’s site.
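A minimal sketch of the problem and the usual fix, in Python. The two render functions are hypothetical stand-ins for whatever a real site uses to build its pages; only `html.escape` is a real standard library call.

```python
import html

def render_unsafe(name):
    """Drop user input straight into the page (vulnerable)."""
    return f"<p>Hello, {name}!</p>"

def render_escaped(name):
    """Escape the special characters before building the page."""
    return f"<p>Hello, {html.escape(name)}!</p>"

# Input an attacker might place in a link's query string.
attack = "<script>alert(document.cookie)</script>"

print(render_unsafe(attack))   # the browser would run this script
print(render_escaped(attack))  # the browser would display it as text
```

Escaping on output is only one layer of defense, but it shows why the "special characters" matter: once `<` and `>` become `&lt;` and `&gt;`, the browser no longer sees a script tag.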
backronym: n.
[portmanteau of back + acronym] A word interpreted as an acronym that was not originally so intended. This is a special case of what linguists call back formation. Examples are given under recursive acronym (Cygnus), Acme, and mung. Discovering backronyms is a common form of wordplay among hackers. Compare retcon.
recursive acronym: n.
A hackish (and especially MIT) tradition is to choose acronyms/abbreviations that refer humorously to themselves or to other acronyms/abbreviations. The original of the breed may have been TINT (“TINT Is Not TECO”). The classic examples were two MIT editors called EINE (“EINE Is Not EMACS”) and ZWEI (“ZWEI Was EINE Initially”). More recently, there is a Scheme compiler called LIAR (Liar Imitates Apply Recursively), and GNU (q.v., sense 1) stands for “GNU’s Not Unix!”; there is also a company with the name Cygnus, which expands to “Cygnus, Your GNU Support” (though Cygnus people say this is a backronym). The GNU recursive acronym may have been patterned on XINU, “XINU Is Not Unix”, a particularly nice example because it is a mirror image, a backronym, and a recursive acronym. See also mung, EMACS.
GNU
GNU (pronounced “Guh-Noo”) is a recursive acronym (see above) that stands for “GNU’s Not Unix!”. It was designed by Richard Stallman to be a free (as in freedom) replacement for the various Unix operating systems. GNU is independent of the kernel that it uses, and the kernel currently used with nearly all implementations is Linux.
Recursion
In programming, recursion means that a function calls itself.
Below is an example of a recursive function:
int Fact(int x) {
    if (x <= 1) return 1;      /* base case stops the recursion */
    return x * Fact(x - 1);    /* recursive case */
}
Hurd
“Hurd” stands for “Hird of Unix-Replacing Daemons”. And, then, “Hird” stands for “Hurd of Interfaces Representing Depth”. We have here, to my knowledge, the first software to be named by a pair of mutually recursive acronyms.
Big-Endian / Little-Endian
The adjectives big-endian and little-endian refer to which bytes are most significant in multi-byte data types and describe the order in which a sequence of bytes is stored in a computer’s memory.
In a big-endian system, the most significant byte in the sequence is stored at the lowest storage address (i.e., first). In a little-endian system, the least significant byte in the sequence is stored first.
The 32-bit value 1025 is stored in bytes 00 through 03 as follows:
Address   Big-Endian   Little-Endian
00        00000000     00000001
01        00000000     00000100
02        00000100     00000000
03        00000001     00000000
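The same layout can be reproduced with Python's built-in `int.to_bytes`. This is just a sketch; the helper `as_bits` is a name I made up for printing each byte in binary.

```python
value = 1025

# Serialize the same 32-bit integer in both byte orders.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

def as_bits(data):
    """Show each byte as eight binary digits, lowest address first."""
    return " ".join(f"{byte:08b}" for byte in data)

print(as_bits(big))     # 00000000 00000000 00000100 00000001
print(as_bits(little))  # 00000001 00000100 00000000 00000000

# Reading the bytes back with the wrong byte order gives a different number.
misread = int.from_bytes(big, byteorder="little")
print(misread)          # 17039360
```

The misread value is why byte order has to be agreed on whenever binary data moves between systems.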
The terms big-endian and little-endian are derived from the Lilliputians of Gulliver’s Travels, whose major political issue was whether soft-boiled eggs should be opened on the big side or the little side. Likewise, the big-/little-endian computer debate has much more to do with political issues than technological merits.
NAT
NAT stands for Network Address Translation, and was initially designed to allow for the use of private IP addresses due to a shortage of public IPs available to be assigned. NAT is most commonly used to maintain a network of private addresses behind a single “real” IP address that is live on the Internet.
Network Protocol Numbers
- ICMP – 1
- IP – 4
- TCP – 6
- UDP – 17
- IPv6 – 41
- GRE – 47
- ESP – 50
- AH – 51
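These numbers show up in the protocol field of the IPv4 header, which is the tenth byte (offset 9). Here is a small Python sketch that reads it back. The header is hand-built for illustration with made-up addresses and a zeroed checksum, and the `PROTOCOLS` table and `protocol_name` helper are my own names.

```python
import struct

# Protocol numbers from the list above (4 is IP encapsulated in IP).
PROTOCOLS = {
    1: "ICMP", 4: "IP", 6: "TCP", 17: "UDP",
    41: "IPv6", 47: "GRE", 50: "ESP", 51: "AH",
}

def protocol_name(ipv4_header):
    """Look up the protocol carried by a packet from its IPv4 header."""
    number = ipv4_header[9]  # the protocol field sits at byte offset 9
    return PROTOCOLS.get(number, f"unknown ({number})")

# A hand-built 20-byte IPv4 header (illustrative values only).
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                   # version 4, header length of 5 32-bit words
    0,                      # type of service
    40,                     # total length
    0,                      # identification
    0,                      # flags and fragment offset
    64,                     # time to live
    6,                      # protocol number: TCP
    0,                      # header checksum (left at zero in this sketch)
    bytes([192, 0, 2, 1]),  # source address (documentation range)
    bytes([192, 0, 2, 2]),  # destination address (documentation range)
)

print(protocol_name(sample_header))  # TCP
```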
TCP Ping
The term “tcp ping” is actually a misnomer. Ping is a nickname for an ICMP echo request, and ICMP is a layer 3 protocol. A “TCP ping” implies that TCP is used, which it is, but TCP is a layer 4 protocol. In actuality, a “tcp ping” is not a ping at all. It is simply a TCP packet destined for a certain port (usually with the ACK flag set).
Upon receiving this type of packet, a host will respond with a TCP packet of its own which will have the RST flag set. When the probing machine sees that packet, it knows that the target is alive, hence the use of the word “ping”. This technique works even if the target is blocking ICMP (something that is becoming more and more common), so the only way to block this type of probe is to drop such packets completely.
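To show what "ACK flag set" and "RST flag set" mean at the byte level, here is a sketch that builds bare 20-byte TCP headers and inspects their flags. It does not send anything: actually transmitting such a probe needs raw socket privileges and a valid checksum, both of which are left out. The port numbers and helper names are assumptions for the example.

```python
import struct

# TCP flag bits as they appear in the header.
ACK = 0x10
RST = 0x04

def build_tcp_header(src_port, dst_port, flags, seq=0, ack=0):
    """Pack a minimal TCP header with no options and a zero checksum."""
    offset_and_flags = (5 << 12) | flags  # data offset of 5 words, then flags
    return struct.pack(
        "!HHIIHHHH",
        src_port, dst_port, seq, ack,
        offset_and_flags,
        1024,  # window size (arbitrary here)
        0,     # checksum (not computed in this sketch)
        0,     # urgent pointer
    )

def has_flag(tcp_header, flag):
    """Check whether a flag bit is set in a packed TCP header."""
    (offset_and_flags,) = struct.unpack("!H", tcp_header[12:14])
    return bool(offset_and_flags & flag)

# The probe: an unsolicited ACK aimed at port 80.
probe = build_tcp_header(40000, 80, ACK)

# What a live host would send back: a reset.
reply = build_tcp_header(80, 40000, RST)

print(has_flag(probe, ACK), has_flag(reply, RST))  # True True
```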
Free Software
From the GNU Website:
“Free software’’ is a matter of liberty, not price. To understand the concept, you should think of “free’’ as in “free speech,’’ not as in “free beer.’’
When people say “free as in beer”, what they mean is that it doesn’t cost any money. “Free as in speech” pertains to liberty.
File Descriptor
An integer that identifies an open file within a process. The number is assigned at the time the file is opened. Anything that reads, writes, or closes a file uses the file descriptor as an input parameter. In Unix, file descriptors 0, 1, and 2 refer to the standard input, standard output, and standard error files respectively.
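Python's `os` module exposes file descriptors directly, so the idea is easy to see in a short sketch. The temporary file path is generated just for this example.

```python
import os
import tempfile

# Create a scratch file path for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Opening the file hands back a plain integer: the file descriptor.
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
print(type(fd), fd)

# Every later operation takes that integer as its first argument.
os.write(fd, b"hello\n")
os.lseek(fd, 0, os.SEEK_SET)  # rewind to the start of the file
data = os.read(fd, 100)
os.close(fd)

print(data)  # b'hello\n'
```

The number printed is usually 3 or higher, since 0, 1, and 2 are already taken by standard input, output, and error.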
Perl
Practical Extraction and Report Language. Created by Larry Wall in 1987.
CGI
Common Gateway Interface. A standard way for a web server to hand user input to a program running on the server side and return that program’s output to the client. CGI scripts are commonly written in Perl.
PHP
Created by Rasmus Lerdorf originally as a Perl CGI script called “Personal Home Page”, or simply “PHP”. The original purpose for the script was to log visitors to the resume page on his website. PHP code is typically embedded within HTML pages and executed on the server before the page is sent to the browser.
SMTP
RFC 821
SMTP is the main protocol used for sending mail on the Internet. Understanding it to at least a moderate degree is a must.
Commands-
HELO – Identifies the sending machine. This is spoofable, but many systems are able to look and see if the IP matches the DNS name given here.
MAIL FROM – The sender address given to the mail server, or, in other words, this is the email address that the sender is claiming the message is coming from.
RCPT TO – The address that the message will be going to. Using multiple RCPT TO commands allows you to send to multiple recipients.
DATA – This is the actual meat of the message. There are no controls on what can be sent in this portion. Words at the beginning of a line that are followed by a colon are interpreted as headers by most mail programs. The end of the DATA section is denoted by a period (.) on a line by itself.
QUIT – This is the command that is used to sever the connection to the mail server.
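Put together, a session is just these commands sent in order. The sketch below builds that sequence as a list of strings without talking to any server. The function name `smtp_commands` and all the addresses are made up for illustration. One detail goes beyond the list above: a message line that itself starts with a period gets an extra period prepended (known as dot-stuffing), so it can't be mistaken for the end of DATA.

```python
def smtp_commands(helo_name, sender, recipients, message):
    """Return the client side of a simple SMTP session, one entry per line."""
    commands = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]

    # One RCPT TO per recipient.
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]

    commands.append("DATA")
    for line in message.splitlines():
        # Dot-stuffing: protect lines that begin with a period.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")  # a lone period ends the DATA section

    commands.append("QUIT")
    return commands

commands = smtp_commands(
    "client.example.com",
    "alice@example.com",
    ["bob@example.org", "carol@example.org"],
    "Subject: Test\n\nHello.\n.hidden",
)

for line in commands:
    print(line)
```

A real client also waits for the server's numeric reply after each command, which this sketch skips.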
Email Headers
When email moves from one server to another, each server adds a new header on top of the previous ones, making a stack of headers. To track which servers handled a given message, start at the bottom of the stack and work your way up, reading each header from left to right.
The “Received” headers are the ones that should be reviewed to find out what has actually happened during the course of an email message’s travels. Many of the other header options are subject to forgery and are less reliable as a source of good information about a particular email message.
It is interesting to note that the true recipient(s) of a message are not viewable in an email header. The actual recipient is declared with the RCPT TO: command given to the mail server, but this information is not available in a header. The To: header option is often present in a header, but this can be forged fairly easily.
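Reading the “Received” stack from the bottom up is easy to script. Here is a minimal sketch with Python's standard `email` package; the message text, host names, and addresses are all invented for the example.

```python
from email import message_from_string

# A made-up message with two Received headers; the newest one is on top.
raw_message = """\
Received: from mx.example.net (mx.example.net [198.51.100.7])
\tby mail.example.org; Tue, 2 Jan 2024 10:00:05 +0000
Received: from laptop.example.com (laptop.example.com [203.0.113.9])
\tby mx.example.net; Tue, 2 Jan 2024 10:00:01 +0000
From: alice@example.com
To: bob@example.org
Subject: Hello

Body text.
"""

msg = message_from_string(raw_message)

# get_all returns the headers top to bottom; reverse them to follow the
# message in the order the servers actually handled it.
received = msg.get_all("Received", [])
hops = [" ".join(header.split()) for header in reversed(received)]

for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

Keep in mind that only the headers added by servers you trust are reliable; anything below them could have been written by the sender.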
Header Options-
Content-Transfer-Encoding: – This says how the message body has been encoded for transport (for example 7bit, quoted-printable, or base64), so that the client knows how to decode it.
Content-Type: – This is the MIME content type for the message in question, and it is what determines what is used on the client to read/interpret the message. This is a header option that can be (and has been) used maliciously by claiming the content type is one thing when it is really something else.
From (no colon) – This is a relatively trustable field that indicates who sent the message.
From: – This is the sender-modifiable from field; don’t trust it.
Message-Id:, Message-id:, Message-ID: – This is a fairly unique identifier assigned to each message – usually by the first mail server that touches it.
In-Reply-To: – Identifies, by its Message-ID, the message that a given message is a reply to. Mail clients use it, together with References:, to group messages into threads.
Priority: – A freeform header option that spammers often use to assign their trash a high level of importance.
Reply-To: – This is the email address that will be the recipient if someone replies to the message in question. Often used by spammers to deflect people’s complaints.
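Return-Path: – The address that bounces and other delivery notices are sent to. It is normally added by the final mail server from the MAIL FROM address, so it is not the same thing as Reply-To:.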
X-Headers – X-Headers are headers that start with “X-” and are for informational use only. Any header that is not standard, and is used for some specific purpose is supposed to use this designation, but this isn’t always the case.
X-Mailer: – This is the X-Header used for identifying the mail client used to send the message.
The Sticky Bit
The sticky bit is a Unix file/directory permissions setting usually used on publicly accessible directories. Normally, when a directory is accessible to the public and additional permissions do not interfere, a user can rename and/or delete files belonging to other users within that directory. The sticky bit prevents this from happening. The permission is set by specifying chmod 1777 octally, or by chmod +t symbolically.
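The same thing can be done and checked from Python on a Unix system. This is a sketch that works on a throwaway temporary directory rather than a real shared one.

```python
import os
import stat
import tempfile

# A scratch directory standing in for a shared, world-writable one.
shared_dir = tempfile.mkdtemp()

# Same effect as running: chmod 1777 on the directory.
os.chmod(shared_dir, 0o1777)

mode = os.stat(shared_dir).st_mode

# stat.S_ISVTX is the sticky bit, the leading 1 in 1777.
sticky_set = bool(mode & stat.S_ISVTX)
permissions = oct(stat.S_IMODE(mode))

print(sticky_set, permissions)
```

In `ls -l` output the sticky bit shows up as a trailing "t" in the permission string, as seen on /tmp on most systems.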
The Setuid Bit
Executable files with this bit set will run with effective uid set to the uid of the file owner. This means that if root creates a script and makes it setuid, whenever it’s run it’ll be run as root rather than as the user that ran the script. This is highly dangerous and should be avoided whenever possible.
Arthur C. Clarke’s Laws
- When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
- The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
- Any sufficiently advanced technology is indistinguishable from magic.
Logarithms
A logarithm is a math term strongly tied to the concept of exponents. In the equation x^y^ = z, we can pretend that any two of the variables are given and solve for the third.
If the base (x) and the exponent (y) are given, we compute a power; if the exponent and the power (z) are given, we compute a root; and if the power and the base are given, we compute a logarithm.
Examples:
- 10^2^ = 100, so log~10~100 = 2
- 10^-2^ = 0.01, so log~10~0.01 = -2
- 10^0^ = 1, so log~10~1 = 0
- 2^3^ = 8, so log~2~8 = 3
- 3^2^ = 9, so log~3~9 = 2
- 25^1/2^ = 5, so log~25~5 = 1/2
In short, finding a logarithm is finding the exponent when you know the base and the power.
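The examples above can be checked with Python's `math` module. This is a quick sketch; `log_base` is a helper name I chose, and because it works in floating point the results are compared approximately instead of exactly.

```python
import math

def log_base(power, base):
    """Find the exponent y such that base ** y equals power."""
    return math.log(power) / math.log(base)

# (base, power, expected exponent) taken from the examples above.
examples = [
    (10, 100, 2),
    (10, 0.01, -2),
    (10, 1, 0),
    (2, 8, 3),
    (3, 9, 2),
    (25, 5, 0.5),
]

for base, power, expected in examples:
    exponent = log_base(power, base)
    matches = math.isclose(exponent, expected, abs_tol=1e-9)
    print(f"log base {base} of {power} is about {exponent:.4f} ({matches})")
```

Dividing two natural logarithms is the standard change-of-base trick, which is why a single `math.log` function is enough for any base.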