My mistakes
when dealing
with legacy code

Dawid Mazur / @dwdmzr
"Experienced" Backend Developer @ Clearcode

Who am I

  • A developer
  • Experience with leading teams and projects
  • An active community member
  • Teaching kids to code

And a lot of experience with legacy code

About the slides

What even is legacy code?

Legacy code is old code

Legacy code is old code

You could be writing it today

Legacy code is bad code

Legacy code is bad code

Legacy code is successful code

Legacy code is not maintainable

Why talk about legacy code?

Because it's hard

Because it's challenging

Because it's gonna happen to you

Or is it?

It's an amazing learning opportunity!

My mistakes
when dealing
with legacy code

Dawid Mazur / @dwdmzr
Pizza appreciation expert @ Clearcode

Mistake #1:
Writing "documentation"

The problem: Legacy system written in...

PHP

PHP 4

Disclaimer

Our solution: the documentation

Examples

  • The naming of variables uses abbreviations, here are some we deciphered...
  • These variables are globals. NEVER FU***N TOUCH THEM.
  • We don't know what this module does, but removing it breaks everything DO NOT FU***N TOUCH!
  • Remember variable names? Wait till you see the database tables xD
  • WELCOME TO HELL \m/

8:00

16:00

Why was that bad?

Documentation is ok
but it's only a part of the solution

We were only making
the legacy codebase bigger
by not making anything better

We should have implemented
unit testing and refactoring

Mistake #2:
"Let's write it again!"

The problem:
Custom company framework

Fun facts about that framework

  • "MVC" architecture
  • SQL queries were built using STRINGS
  • You could erase the whole DB table accidentally
  • Funniest e-commerce module I ever saw
  • NOT A SINGLE TEST

Our solution:
Custom company framework
2.0

Why was that bad?

Fun facts about that 2.0 framework

  • "MVC" architecture
  • SQL queries were handled by ORM \o/
  • You couldn't erase data accidentally
  • What e-commerce module?
  • Some tests, but mostly outdated

Was that a win? NOPE

Most of the time
it's better to fix
than to start from scratch

We should have implemented
unit testing and refactoring

So... why unit testing and refactoring?

Your code is already tested,
and by the worst kind of testers
the end-users

but, you have to make changes

changes = regression

software regression a software bug that makes a feature stop functioning as intended after a certain event (for example, a system upgrade, system patching)

Regression is not that big of a deal,
when the code is tested correctly

But we're talking legacy

and if it's not tested already,
it's probably not that testable

"I won't need to change anything"

  • Standards change
  • Dependencies deprecate
  • Security is getting harder

So...

How to
legacy code

Dawid Mazur / @dwdmzr
Cringe artist @ Clearcode

Step 0: Gear up

Good IDE and a debugger
are an absolute necessity!

Legacy code burnout
is real!

Automatic refactoring

Static analysis is awesome!

Step 1: Don't add more legacy code

Use TDD

Separate the new codebase

Step 2: Improve existing code

Mission: make code maintanable

How? Refactoring

refactoring (n.). A change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its existing behavior.

"Maybe I'll just refactor without testing?"

Never refactor
without tests

Tests = The software vise

What should you test?

Ideally everything touching the refactor

(╯°□°)╯︵ ┻━┻

What should I test?

  1. Characterization tests
  2. Reasoning about effects
  3. Effect propagation
  4. "One level back" testing

The example


class OrderProcessor extends AbstractProcessor {
	public function __construct()
	{
		$this->orderList = [];
	}
	// (MANY MANY LOC)

	public function addOrders($orders) : void {
		foreach($orders as $order) {
			$order->setStatus(Order::STATUS_PROCESSING);
		}

		array_merge($this->orderList, $orders);
	}
	// (MANY MANY LOC)
}
					

1. Characterization tests

What do i think the code does?


public function addOrders($orders) : void {
	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
				

2. Reasoning about effects


public function addOrders($orders) : void {
	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
				

public function addOrders($orders) : void {
	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
				

3. Effect propagation

  • Global objects, hidden "seams"
  • Subclasses
  • Superclasses even (do not break LSP)

4. "one level back" testing

Now we know where to test...

Time to bring the code into
a test harness

(Easier said than done)

I want to test, so I can refactor
but I have to refactor, so I can test

Challenges

  • Dependencies
  • Side effects
  • Globals
  • Monster methods / god objects

Time for a 4 (developers)2 survey™

We have a lot of tools
at our disposal

  • Useful design patterns
  • Variable sensing
  • Break Out Method Object
  • Scratch refactoring
  • and a lot more!

Sprout method


public function addOrders($orders) : void {
	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
                    

Sprout method


public function addOrders($orders) : void {
	$orders = $this->filterUniqueAndProcessed($orders);

	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
                    

What do we get

  • Clear separation
  • We can test the new code only
  • Trouble testing? Try static or...

Sprout class


public function addOrders($orders) : void {
	$filter = new ProcessedOrderFilter();
	$orders = $filter->filter($orders);

	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}
                    

Even better separation: Wrap method


private function addToProcessing(array $orders) {
	foreach($orders as $order) {
		$order->setStatus(Order::STATUS_PROCESSING);
	}

	array_merge($this->orderList, $orders);
}

public function addOrders($orders) : void {
	$this->log($orders);
	$this->addToProcessing($orders);
}
                    

Wrap class


class LoggedOrderProcessor extends OrderProcessor {
    private function log(array $orders) {}

    public function addOrders($orders) : void {
        $this->log($orders);
        parent::addOrders($orders);
    }
}
                    

Or you can use
the decorator pattern


class LoggingProcessorDecorator implements ProcessorInterface {
    private $processor;

    private function log(array $orders) {}

    public function __construct(ProcessorInterface $processor)
    {
        $this->processor = $processor;
    }

    public function addOrders($orders) : void {
        $this->log($orders);
        $this->processor->addOrders($orders);
    }
}
                    

$processor = new LoggingProcessorDecorator(OrderProcessor());
                    

Testing monster methods

How did the author test this lib?

Variable sensing

Variable sensing


public $seekIdPresentInSegment = false;

/**
 * @param array $info
 */
private function parseEBML(&$info) {
	$this->seekIdPresentInSegment = false;
	// http://www.matroska.org/technical/specs/index.html#EBMLBasics
	$this->current_offset = $info['avdataoffset'];

					

Variable sensing


while ($this->getEBMLelement($sub_seek_entry, $seek_entry['end'], true)) {
	switch ($sub_seek_entry['id']) {
		case EBML_ID_SEEKID:
			$this->seekIdPresentInSegment = true;
			$seek_entry['target_id']   = self::EBML2Int($sub_seek_entry['data']);
			$seek_entry['target_name'] = self::EBMLidName($seek_entry['target_id']);
			break;
		case EBML_ID_SEEKPOSITION:
			$seek_entry['target_offset'] = $element_data['offset'] + getid3_lib::BigEndian2Int($sub_seek_entry['data']);
			break;
		default:
			$this->unhandledElement('seekhead.seek', __LINE__, $sub_seek_entry);												}
			break;
	}
					

Variable sensing


use PHPUnit\Framework\TestCase;

class getid3_matroskaTest extends TestCase
{
    public function testPushAndPop()
    {
        $id3 = new getID3();
        $id3->openfile('./test/matroska_test_w1_1/test1.mkv');
        $sut = new getid3_matroska($id3);
        $sut->Analyze();

        $this->assertTrue($sut->seekIdPresentInSegment);
    }
}
					

Success!


➜  ./vendor/bin/phpunit test
PHPUnit 8.3.5 by Sebastian Bergmann and contributors.

.                                  1 / 1 (100%)

Time: 137 ms, Memory: 6.00 MB

OK (1 test, 1 assertion)

					

Break Out Method Object


class OrderProcessor {
	public function addOrders(
		OrderCollection $orders,
		int $capacity = 10) : void {
		(...)
	}
}
					

Break Out Method Object


class OrderProcessingScheduler {
	public __construct(
		OrderProcessor $processor,
		OrderCollection $orders,
		int $capacity = 10
	) {
		$this->processor = $processor;
		$this->orders = $orders;
		$this->capacity = $capacity;
		(...)
	}

	public addOrders() : void {}
}
					

Break Out Method Object


class OrderProcessor extends OrderProcessorInterface {}
					

Break Out Method Object


class OrderProcessingScheduler {
	public __construct(
		OrderProcessorInterface $processor,
		OrderCollection $orders,
		int $capacity = 10
	) {
		(...)
	}
}
					

Break Out Method Object


class OrderProcessor {
	public function addOrders(OrderCollection $orders, int $capacity = 10) : void {
		$scheduler = new OrderProcessingScheduler(
			$this,
			$orders,
			$capacity
		);
		$scheduler->addOrders();
	}
}
					

Bonus: Scratch refactoring

And we haven't really gotten to refactoring yet ;)

Recommended reading

Takeaways

  • Working with legacy code is good for you!
  • Rewriting the whole project almost never works
  • Resist the urge to just refactor without testing
  • There are many ways to combat legacy code
  • Be disciplined and systematic, and the progress will show
  • It's going to be ok ;)

Good luck!

Feedback is very welcome!

Thank you!

Stay safe and healthy out there :)
Questions?
@dwdmzr | dwdmzr | dwd.mazur


Check out IT Depends
Code with Coding Dojo Silesia
Get inspired by art.exe

The slides are at dawidmazur.eu/legacy-code