Saturday, April 18, 2026

YAGNI

YAGNI ("You Ain't Gonna Need It") is an Extreme Programming (XP) principle stating that functionality should only be added when it is actually needed, rather than when it is foreseen. It prevents overengineering and reduces technical debt by avoiding the creation of unnecessary, complex features that often go unused. 

source: AI overview of Wikipedia

Key Aspects of YAGNI:

  • Benefits: Reduces the cost of building, testing, and maintaining unnecessary code. It also minimizes the "cost of delay" for actual, needed features.
  • Application: It is commonly used in Agile development to maintain a clean codebase and focus on current, tangible value.
  • Drawbacks/Risks: If taken too far, it can lead to a lack of necessary architectural planning, making future refactoring difficult. It is not an excuse for ignoring security or essential design.

Example: Instead of building a complex, generic search algorithm for every conceivable scenario, you only implement the specific filtering required by the user right now. 

source: AI overview from martinfowler.com
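
As a rough illustration of the example above, the sketch below implements only the one filter a user has asked for. The Product class, the filter_by_category function, and the sample catalog are hypothetical names invented for this sketch, not taken from any of the sources.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    category: str
    price: float


def filter_by_category(products: list[Product], category: str) -> list[Product]:
    """Return only the products in the requested category.

    Deliberately absent: pluggable predicates, ranking, fuzzy matching, and
    query parsing. Under YAGNI those wait until someone actually needs them.
    """
    return [p for p in products if p.category == category]


# Hypothetical sample data for the sketch.
catalog = [
    Product("kettle", "kitchen", 25.0),
    Product("desk lamp", "office", 40.0),
    Product("toaster", "kitchen", 30.0),
]

kitchen_items = filter_by_category(catalog, "kitchen")
print([p.name for p in kitchen_items])
```

If a second kind of filter is requested later, that is the point at which a more general design can be weighed against real requirements.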

When to Apply YAGNI:

  • When tempted to write "future-proof" code.
  • When features are proposed based on speculative, unverified future needs.
  • When complex abstractions are added prematurely. 

source: AI overview of Reddit


Saturday, April 11, 2026

eight wastes (lean): TIM WOODS

TIM WOODS

Transportation

Moving stuff or information more often or further than necessary

Inventory

Extra or unnecessary stuff or information; multiple versions

Motion

Poor ergonomics; poor layout; cruft; manual/repeated data entry

Waiting

Batch processing; insufficient training/staffing/processes/capacity

Overproduction

Extra stuff is created; poor forecasting; inappropriate performance measures

Over-processing

More is done than necessary; misunderstanding customer requirements or quality standards; providing too much detail

Defects

Rework; ineffective detection of process failures; poor training/design/documentation; incorrect/incomplete input data/information

Skills

Ineffective organizational management structure/culture; risk aversion; not using a person's skill; delegating to someone unskilled; staff blocked from their task (not empowered)

Saturday, April 4, 2026

SIPOC

A SIPOC diagram is a high-level process map that outlines the following:

  • Suppliers: The individuals, departments, or entities that provide the necessary inputs for the process.
  • Inputs: The resources, materials, or information required for the process to function properly.
  • Process: The series of steps or activities that transform the inputs into outputs.
  • Outputs: The end products, services, or deliverables generated by the process.
  • Customers: The recipients or beneficiaries of the outputs, whether internal or external to the organization.

source: https://www.6sigma.us/process-mapping/sipoc-six-sigma/

Steps to create a SIPOC-R (SIPOC plus Requirements) diagram/chart:

  1. Process: list the 5-7 main steps of the process. Include when the process starts and ends.
  2. Outputs: list the outputs
  3. Customers: list the customer for each output
  4. Requirements: list the requirements for each output
  5. Inputs: list the inputs required to run the process from start to finish
  6. Suppliers: list who provides each input

Tuesday, March 24, 2026

vibe coding

The term "vibe coding" is attributed to Andrej Karpathy, one of the founders of OpenAI, who introduced it on X in early 2025.

Andrej Karpathy @karpathy Feb 2, 2025

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. ... I ask for the dumbest things like "decrease the padding on the sidebar by half" because I'm too lazy to find it. I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away. It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

source: https://x.com/karpathy/status/1886192184808149383

I don't know Karpathy, but I do know people who use the approach "can't fix a bug so I just work around it until it goes away". I also know people who stop after "it mostly works". Now for "throwaway weekend projects", that's fine--as long as it's thrown away. I do take offense that "webapp" is not really coding. If anything is not real coding, it's "vibe coding".

But I've been thinking about what we do with the code generated by AI and comparing it to older methods of improving programming. Programming languages have been described using generations. The following definitions are adapted from Wikipedia. In early 2026, the generative-AI LLMs write 3GL code. The 3GL code becomes the persisted, maintained content, whereas the 2GL and ultimately 1GL output that it compiles down to (e.g., exe, lib, obj) is not maintained. It is always regenerated from the higher-level language. At some point, I expect AI prompts and skills to become a 6GL written in natural language (like COBOL hoped to be).

First generation (1GL)

A first-generation programming language (1GL) is a machine-level programming language. The instructions in 1GL are expressed in binary, represented as 1s and 0s. 

Second generation (2GL)

Second-generation programming language (2GL) is a generational way to categorize assembly languages.

Third generation (3GL)

Examples: C, C++, Java, Python, PHP, Perl, C#, BASIC, Fortran, COBOL

3GLs are much more machine-independent (portable) and more programmer-friendly. 3GLs are more abstract than previous generations of languages, and thus can be considered higher-level languages than their first- and second-generation counterparts. Most 3GLs support structured programming. Many support object-oriented programming. Traits like these are more often used to describe a language rather than just being a 3GL.

Fourth generation (4GL)

Examples: Unix shell, SQL, Oracle Reports, R

Fourth-generation languages tend to be specialized toward very specific programming domains. 4GLs may include support for database management, report generation, mathematical optimization, GUI development, or web development.

Fifth generation (5GL)

Examples: Prolog, OPS5, Mercury, ICAD, Geometry Expert, LISP

A fifth-generation programming language (5GL) is any programming language based on problem-solving using constraints given to the program, rather than using an algorithm written by a programmer. They may use artificial intelligence techniques to solve problems in this way. Most constraint-based and logic programming languages and some other declarative languages are fifth-generation languages. Fifth-generation languages are used mainly in artificial intelligence research.

Saturday, March 14, 2026

Simpson's Paradox

Simpson's Paradox is a statistical phenomenon that occurs when you combine subgroups into one group. The process of aggregating data can cause the apparent direction and strength of the relationship between two variables to change.

source: https://statisticsbyjim.com/basics/simpsons-paradox/

source: https://www.tjanesky.com/post/simpsons-paradox
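
A small worked example may make the reversal concrete. The counts below are the figures commonly quoted from a kidney stone treatment study; they do not come from the sources cited above, and the variable and function names are made up for this sketch.

```python
# (successes, patients) for each treatment, split by stone size.
outcomes = {
    "small stones": {"A": (81, 87), "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}


def subgroup_rate(group: str, treatment: str) -> float:
    """Success rate of one treatment within one subgroup."""
    successes, patients = outcomes[group][treatment]
    return successes / patients


def combined_rate(treatment: str) -> float:
    """Success rate of one treatment after aggregating all subgroups."""
    successes = sum(by_treatment[treatment][0] for by_treatment in outcomes.values())
    patients = sum(by_treatment[treatment][1] for by_treatment in outcomes.values())
    return successes / patients


for group in outcomes:
    print(group, f"A={subgroup_rate(group, 'A'):.0%}", f"B={subgroup_rate(group, 'B'):.0%}")
print("combined", f"A={combined_rate('A'):.0%}", f"B={combined_rate('B'):.0%}")
```

Treatment A has the higher success rate inside each subgroup, yet treatment B looks better once the subgroups are combined, because A was given far more often to the harder large-stone cases.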


See the TWOTW articles on clustering.

Saturday, March 7, 2026

Semantic Versioning (SemVer)

Semantic versioning (also known as SemVer) is a widely used convention for numbering software releases so that the version number itself tells you what kind of change a release contains.

A semantic version is a 3-component number in the format X.Y.Z (Major.Minor.Patch), where:

  • X stands for the major version, the leftmost number. When you increase the major version, you increase it by one and reset both the minor and patch versions to zero. If the current version is 2.6.9 then the next upgrade for a major version will be 3.0.0. Increase the value of X when breaking the existing API.
  • Y stands for the minor version. It is used for the release of new functionality in the system. When you increase the minor version, you increase it by one and reset the patch version to zero. If the current version is 2.6.9 then the next upgrade for a minor version will be 2.7.0. Increase the value of Y when implementing new features in a backward-compatible way.
  • Z stands for the patch version. It is used for bug fixes; there are no functionality changes in a patch upgrade. If the current version is 2.6.9 then the next version for a patch upgrade will be 2.6.10. There is no limit to these numbers. Increase the value of Z when fixing bugs.

Points to keep in mind:

  • The first version starts at 0.1.0 and not at 0.0.1, as no bug fixes have taken place; rather, we start with a set of features as the first draft of the project.
  • Versions before 1.0.0 are the development phase, where you focus on getting stuff done while the system is still being developed.
  • SemVer makes no stability guarantees for versions tagged 0.*.*; anything may change at any time. The first stable version is 1.0.0.

source: https://www.geeksforgeeks.org/software-engineering/introduction-semantic-versioning/

Given a version number MAJOR.MINOR.PATCH, increment the:

  1. MAJOR version when you make incompatible API changes
  2. MINOR version when you add functionality in a backward compatible manner
  3. PATCH version when you make backward compatible bug fixes

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

source: https://semver.org/
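
The increment rules can be sketched in a few lines of Python. The Version class and its parse and bump methods are hypothetical names for this sketch; it handles only plain MAJOR.MINOR.PATCH strings and ignores pre-release and build metadata labels.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        """Parse a plain 'X.Y.Z' string (no pre-release or build labels)."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, part: str) -> "Version":
        """Return the next version, resetting the lower-order components."""
        if part == "major":
            return Version(self.major + 1, 0, 0)
        if part == "minor":
            return Version(self.major, self.minor + 1, 0)
        if part == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown part: {part!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("2.6.9")
print(current.bump("major"))  # 3.0.0
print(current.bump("minor"))  # 2.7.0
print(current.bump("patch"))  # 2.6.10
```

Because the components are stored as integers, comparisons are numeric: 2.6.10 sorts after 2.6.9, which a plain string comparison would get wrong.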

Saturday, February 28, 2026

PBKDF (Password Based Key Derivation Function)

PBKDF2, defined in RFC 2898, is a specific Key Derivation Function (KDF). A KDF is simply any mechanism for taking a password (something a user remembers or stores in a password manager) and turning it into a symmetric key suitable for cryptographic operations (e.g., AES).

PBKDF2 takes as input a password, a salt (see TWOTW Salt and Pepper), an integer defining how many “iterations” of the hash function to undergo, and an integer describing the desired key length for the output.

source: https://www.ssltrust.com/blog/pbkdf2-password-key-derivation

PBKDF2 applies a pseudorandom function, such as hash-based message authentication code (HMAC), to the input password or passphrase along with a salt value and repeats the process many times to produce a derived key, which can then be used as a cryptographic key in subsequent operations. The added computational work makes password cracking much more difficult, and is known as key stretching (see TWOTW Key Stretching).

In 2023, OWASP recommended using 600,000 iterations for PBKDF2-HMAC-SHA256 and 210,000 for PBKDF2-HMAC-SHA512.

[Figure: algorithmic representation of the iterative process of PBKDF2.]

Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables; see TWOTW Rainbow Tables) for attacks, and means that multiple passwords have to be tested individually, not all at once. The Public-Key Cryptography Standards (PKCS #5) recommend a salt length of at least 64 bits. The US National Institute of Standards and Technology recommends a salt length of at least 128 bits.

source: https://en.wikipedia.org/wiki/PBKDF2
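
The four inputs described above map directly onto Python's standard-library hashlib.pbkdf2_hmac. The following is a minimal sketch, not a vetted implementation: derive_key is a hypothetical wrapper name, and the 600,000 iteration default and 128-bit salt simply follow the figures quoted above.

```python
import hashlib
import secrets


def derive_key(password: str, salt: bytes, iterations: int = 600_000,
               key_length: int = 32) -> bytes:
    """Derive key_length bytes from a password using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",                  # hash used inside HMAC
        password.encode("utf-8"),  # the password
        salt,                      # per-password random salt
        iterations,                # how many times the function is repeated
        dklen=key_length,          # desired key length in bytes
    )


salt = secrets.token_bytes(16)  # 128-bit random salt
key = derive_key("correct horse battery staple", salt)
print(len(key), "byte key:", key.hex())
```

The salt is not secret and must be stored alongside the derived key (or hash) so the same derivation can be repeated later.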

Further reading: Password Storage Cheat Sheet

Saturday, February 21, 2026

Key Stretching (cryptography)

Key stretching techniques are used to make a possibly weak key, typically a password or passphrase, more secure against a brute-force attack by increasing the resources (time and possibly space) it takes to test each possible key. Passwords or passphrases created by humans are often short or predictable enough to allow password cracking, and key stretching is intended to make such attacks more difficult by complicating a basic step of trying a single password candidate. 

There are several ways to perform key stretching. One way is to apply a cryptographic hash function or a block cipher repeatedly in a loop. For example, in applications where the key is used for a cipher, the key schedule in the cipher may be modified so that it takes a specific length of time to perform. Another way is to use cryptographic hash functions that have large memory requirements.
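
The first approach, hashing repeatedly in a loop, can be sketched as follows. This is a toy illustration only: stretch_key is a hypothetical name, SHA-256 and the default round count are arbitrary choices for the sketch, and real systems would normally use an established function such as PBKDF2 or Argon2 instead.

```python
import hashlib


def stretch_key(password: bytes, salt: bytes, rounds: int = 100_000) -> bytes:
    """Feed each digest back into the hash so one guess costs `rounds` hashes."""
    key = b""
    for _ in range(rounds):
        key = hashlib.sha256(key + password + salt).digest()
    return key


stretched = stretch_key(b"hunter2", b"per-user-salt")
print(stretched.hex())
```

An attacker testing candidate passwords has to repeat the whole loop for every guess, which is where the added cost comes from.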

Modern password-based key derivation functions, such as PBKDF2 (see TWOTW PBKDF), use a cryptographic hash, such as SHA-2, a longer salt (e.g. 64 bits) and a high iteration count. The U.S. National Institute of Standards and Technology (NIST) recommends a minimum iteration count of 10,000.

A Password Hashing Competition, launched in 2013, sought an improved key stretching standard that would resist attacks from graphics processors and special purpose hardware. The winner, Argon2, was selected in July 2015.

source: https://en.wikipedia.org/wiki/Key_stretching

Further reading: Password Storage Cheat Sheet

Saturday, February 14, 2026

Salt and Pepper (password cryptography)

Salt

A “salt” is a random string that is added to a password before it undergoes the hashing process. The primary purpose of salting is to add uniqueness to each hashed password, even when two users have identical passwords.

Pepper

Pepper is a secret value added to the password before hashing. Unlike a salt, the pepper is not stored with user records. Instead, it is a fixed value (or a set of values) used across the system and kept private, away from the user/password records. Pepper is often hard-coded into the application or stored in a secure configuration file.

source: https://little-fire.com/salt-and-pepper-in-password-cryptography/

Combining the password with the pepper value means that even if the attacker has the hash and the salt, they still won’t have enough information to be able to easily get the original password back out. 

Because the original passwords are not stored, a pepper value generally cannot be changed without forcing every user to reset their password.

source: https://www.baeldung.com/cs/password-salt-pepper
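
Here is a minimal sketch of how salt and pepper might be combined when storing a password. Everything named here is an assumption made for illustration: the PEPPER value is a placeholder (a real one would come from secure configuration), mixing the pepper in with HMAC is just one possible way to "add" it, and PBKDF2-HMAC-SHA256 stands in for whichever password hash the system actually uses.

```python
import hashlib
import hmac
import secrets

# Hypothetical pepper: one secret for the whole system, stored apart from user records.
PEPPER = b"example-pepper-loaded-from-secure-config"


def hash_password(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Mix in the system-wide pepper, then hash with the per-user salt."""
    peppered = hmac.new(PEPPER, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, iterations)


def verify_password(password: str, salt: bytes, stored_hash: bytes,
                    iterations: int = 600_000) -> bool:
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(hash_password(password, salt, iterations), stored_hash)


# Each user record stores its own random salt next to the hash; the pepper is not stored there.
alice_salt = secrets.token_bytes(16)
alice_hash = hash_password("hunter2", alice_salt)
print(verify_password("hunter2", alice_salt, alice_hash))
```

Two users with the same password end up with different stored hashes because their salts differ, and an attacker who steals only the user table still lacks the pepper.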

Further reading: Password Storage Cheat Sheet

Saturday, February 7, 2026

Rainbow Table

Rainbow Tables are commonly confused with another, simpler technique that leverages a compute time-storage tradeoff in password recovery: hash tables.

Hash tables are constructed by hashing each word in a password dictionary. The password-hash pairs are stored in a table, sorted by hash value. To use a hash table, simply take the hash and perform a binary search in the table to find the original password, if it's present.

Rainbow Tables are more complex. Constructing a rainbow table requires two things: a hashing function and a reduction function. The hashing function for a given set of Rainbow Tables must match the hashed password you want to recover. The reduction function must transform a hash into something usable as a password. A simple reduction function is to Base64 encode the hash, then truncate it to a certain number of characters.

Rainbow tables are constructed of "chains" of a certain length: 100,000 for example. To construct the chain, pick a random seed value. Then apply the hashing and reduction functions to this seed, and its output, and continue iterating 100,000 times. Only the seed and final value are stored. Repeat this process to create as many chains as desired.

To recover a password using Rainbow Tables, the password hash undergoes the above process for the same length (in this case 100,000), but each link in the chain is retained. Each link in the chain is compared with the final value of each chain. If there is a match, the chain can be reconstructed, keeping both the output of each hashing function and the output of each reduction function. That reconstructed chain will contain the hash of the password in question as well as the password that produced it.

source: Answer by Crunge on security.stackexchange.com

See also: How Rainbow Tables work 

See also: https://en.wikipedia.org/wiki/Rainbow_table

Further reading: Password Storage Cheat Sheet