After two failed attempts, both disproved by @Hendrik Jan (thank you), here is another one, which is no more successful. @Vor found an example of a deterministic CF language where the same construction would apply, if it were correct. This allowed identifying an error in the anchoring of the string in the application of the lemma. The lemma itself does not seem to be at fault. The construction is clearly too simplistic. See the comments for more details.
The language is not context-free.
anywhere between the two 1's. We want to pump that string on the first part between the 1's, so that it will become which is not supposed to be in the language.
We first try to use Ogden's lemma, which is like the pumping lemma, but applies to or more distinguished symbols that are marked on the string, being the pumping length for marked symbols (but the lemma can pump more because it can pump also unmarked symbols). The pumping marked-length depends only on the language. This attempt will fail, but the failure will be a hint.
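For reference, the standard statement of Ogden's lemma can be sketched as follows. The symbol names ($p$ for the marked pumping length, $w$ for the string, $u, x, y, z, v$ for its factors) are chosen here for illustration only and are an assumption; they need not match the notation used in the rest of this answer.

```latex
\textbf{Ogden's lemma.} If $L$ is context-free, there exists $p \ge 1$ such that
every $w \in L$ in which at least $p$ positions are marked can be written as
$w = u\,x\,y\,z\,v$ where
\begin{itemize}
  \item $xz$ contains at least one marked position,
  \item $xyz$ contains at most $p$ marked positions, and
  \item $u\,x^{i}\,y\,z^{i}\,v \in L$ for every $i \ge 0$.
\end{itemize}
```

The case $i = 0$ is the "pumping out" case, and taking every position as marked gives back the ordinary pumping lemma.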
We can then choose and we mark symbols on the first sequence of 0's.
We know that neither of the two 1's can be in the pumped substrings, because the lemma also allows pumping out once (exponent 0) instead of pumping in, and pumping out a 1 would take us out of the language.
However, the pumping could take place on both sides of the second 1, as fast or even faster on its right side, so that the second 1 would never get across the middle of the string. Also, Ogden's lemma sets no upper limit on the size of what is pumped, so it is not possible to arrange the pumping so that the rightmost 1 lands exactly across the middle of the string.
We use a modified version of the lemma, here called Nash's Lemma, which can handle these difficulties.
We first need a definition (it probably has another name in the literature, but I do not know which - help is welcome). A string is said to be an erasure of a string iff it is obtained from by erasing symbols in . We will note .
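The relation defined above is commonly called the subsequence (or scattered subword) relation. A minimal sketch of a test for it, assuming strings are ordinary Python `str` values; the function name `is_erasure` is hypothetical and introduced only for this illustration:

```python
def is_erasure(candidate: str, original: str) -> bool:
    """Return True if `candidate` can be obtained from `original` by erasing symbols."""
    remaining = iter(original)
    # Each symbol of `candidate` must be found, in order, in what is left of
    # `original`; the `in` test consumes the iterator up to the match.
    return all(symbol in remaining for symbol in candidate)


if __name__ == "__main__":
    print(is_erasure("010", "0110"))  # erase one of the two 1's
    print(is_erasure("11", "010"))    # only one 1 is available
```

Every string is an erasure of itself, and the empty string is an erasure of every string.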
Nash's Lemma :
If is a context-free language, then there exist two numbers and such that for any string of length at least in , and every way of “marking” or more of the positions in , can be written as with strings , , , , , such that
- has at least one marked position,
- has at most marked positions, and
- there are 3 strings , , such that
- , , ,
- , , and
- is in for every and for every .
Proof: Similar to the proof of Ogden's lemma, but the subtrees corresponding to the strings and are pruned so that they do not contain any path on which the same non-terminal occurs twice (except for the roots of these two subtrees). This necessarily bounds the size of the generated strings and by a constant .
The strings and , for , corresponding to an unpruned version of the tree, are used mainly with to simplify the accounting when the lemma is applied.
We modify the above proof attempt by marking the leftmost symbols 0, but they are followed by symbols 0 to make sure that we pump in the left part of the string, between the two 1's. That makes a total of 0's between the 1's (actually would be sufficient, since the rightmost 1 cannot be in , as that would allow simply removing it).
What remains is to choose so that we can pump exactly the right number of 0's to make the two sequences equal. So far, the only constraint on is that it be greater than . We also know that the number of 0's added at each pumping step is between 1 and q. So let be the product of the first integers. We choose .
Hence, since the pumping increment - whatever it is - is in , it divides . Let be the quotient. If we pump exactly times, we get a string which is not in the language. Hence L is not context-free.
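The divisibility step can be checked numerically. The sketch below assumes that the quantity to be reached is the product of the first q integers (that is, q!) and that each pumping step adds some fixed number of 0's between 1 and q; the name `pump_count` is hypothetical and used only for this illustration.

```python
from math import factorial


def pump_count(q: int, increment: int) -> int:
    """Number of pumping steps needed to add exactly q! symbols,
    when each step adds `increment` symbols, with 1 <= increment <= q."""
    if not 1 <= increment <= q:
        raise ValueError("increment must lie between 1 and q")
    # Any integer between 1 and q is a factor of q!, so the division is exact.
    quotient, remainder = divmod(factorial(q), increment)
    assert remainder == 0
    return quotient


if __name__ == "__main__":
    q = 5
    for increment in range(1, q + 1):
        steps = pump_count(q, increment)
        print(increment, steps, increment * steps)
```

Whatever increment the lemma happens to yield, the same target count is reached, which is why the increment does not need to be known in advance.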
I think that I shall never see
A string lovely as a tree.
For if it does not have a parse,
The string is naught but a farce