In the case of LICM, I did things in a few steps: copy, patch, garbage collect.
Each invariant instruction is copied to the pre-header "verbatim", where the instructions are all new, but equivalent to their originals.
Each copied instruction is then patched, so that if an instruction defines a temporary register, then I replace it with a new temporary register, and replace any use (can only be one) in the copied instructions with the new temporary register. This is guaranteed to be safe because a use of a temp can only be moved to the preheader iff the def of the temp is loop-invariant.
Finally, I garbage collect the original versions of the invariant instructions by converting every instruction that doesn't define a temporary variable into a NOP. If the instruction defines a temporary then I leave it for DCE. This is because one might have the case that the definition of a temporary is loop-invariant, but the use is not. In this case, the moved (i.e. copied and patched) definition will be removed by DCE.
Moving a definition of a temp register is likely the cause of that error for you. TL;DR: I avoided that issue by copying and patching instead of moving.
Yes. After EVAL, CP, CF, DCE, and CSE were all re-run in the same way as in the beginning of the pipeline.
That is unusual that CSE before LICM would be a disimprovement. It might be that your CSE pass was causing extra sharing of registers or something, thus making one of the registers not loop-invariant.
The general progression of optimisations for me was:
Good suggestion. I will look into that. I'm used to Subversion and so it's been my go-to tool of choice, but I think that it's about time I switch look into other solutions.
Also, I suppose that privately hosting the subversion repository on my domains isn't the most welcoming to possible future contributors!
For those interested, a good example of a language meant to describe the structure of another language but that cannot describe itself would be regular expressions.
A language describing all regular languages contains all regular expressions; however, the language of all regular expressions is not regular (it is context-free, to see this, consider that regular expressions cannot be used to determine whether or not a string has a balanced number of parentheses).
I recently posted an article on the basic algorithm of PEGs (without followed-by and not-followed-by operators) in my most recent blog post titled PEGs and Left Recursion.
The algorithms I have written on the subject interprets a grammar data structure. The interpretation is similar to how a recursive descent parser works, with the exception that the algorithms caches the results of each time a non-terminal of the grammar is interpreted at a particular point in the input stream.
The algorithm written about in the post roughly corresponds to the workings of this implementation of PEGs without the aforementioned operators.
In the past, I've found it easier to implement PEG using this formulation to gain power equivalent to the aforementioned operators.
If you have further questions on this subject, then feel free to ask them in the comments section of my most recent blog post about PEGs.