This is more an extended comment with a conjecture than an answer, but here is a condition on a regular language $L$ that seems to capture when $S(L)$ is context-free.
Condition
In the minimal DFA A for L, any accepting path contains at most one loop.
Exception: two loops are allowed if their labels and the label of the prefix before the first loop all commute, and the suffix after the second loop is empty. For instance $aa^*b(aa)^*$ is fine.
Recall that two words $u$ and $v$ commute if they are powers of the same word $t$. We can assume the suffix is empty, because in a DFA it cannot be both non-empty and commute with the label of the second loop.
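As a small illustration of the commutation notion used here, the following Python sketch tests whether two words commute and computes the shortest word of which a given word is a power. The function names `commute` and `primitive_root` are my own, chosen for this example. It relies on the standard fact that two non-empty words commute exactly when they are powers of a common word.

```python
def commute(u: str, v: str) -> bool:
    """Return True when uv == vu, i.e. when u and v commute."""
    return u + v == v + u


def primitive_root(w: str) -> str:
    """Return the shortest t such that w is a power of t (w must be non-empty)."""
    n = len(w)
    for d in range(1, n + 1):
        # t = w[:d] works when d divides n and repeating t rebuilds w.
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    raise ValueError("the empty word has no primitive root")
```

In the example $aa^*b(aa)^*$, the prefix `a` and the loop labels `a` and `aa` all have primitive root `a`, so they commute pairwise.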
Sufficient
Assume the condition holds. We build a PDA for $S(L)$ by treating each accepting pattern $xuy$ of $A$, where $u$ labels a simple loop. We want to accept words of the form $x u^n y\, x u^n y$. We read $x$, push a symbol for every occurrence of $u$, read $yx$, then pop a symbol for every occurrence of $u$, and finally read $y$.
For the exception, a basic accepting path is of the form $xuyv$, where $u$ and $v$ are the labels of the loops. We must accept words of the form $x u^n y v^m\, x u^n y v^m$. Since $x$, $u$ and $v$ commute, we have $x u^n = u^n x$ and $v^m x u^n = u^n v^m x$, so this word equals $u^n\, xy\, u^n v^m\, xy\, v^m$, which a PDA can recognize: push $n$ times (for occurrences of $u$), read $xy$, pop $n$ times, push $m$ times (for occurrences of $v$), read $xy$, pop $m$ times.
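The single-loop construction can be sketched as a direct stack simulation. This is only an illustration for one fixed pattern, with hypothetical function names, and it assumes the loop label `u` is non-empty. The PDA's nondeterministic choice of when to stop reading copies of `u` is resolved here by trying "stop now" before every push.

```python
def accepts_single_loop(word: str, x: str, u: str, y: str) -> bool:
    """Accept exactly the words x u^n y x u^n y with n >= 0 (u non-empty)."""
    if not word.startswith(x):
        return False
    pos = len(x)
    stack = []
    while True:
        # Guess that the push phase is over and try to finish the run.
        if _finish(word, pos, len(stack), x, u, y):
            return True
        # Otherwise read one more copy of u and push a symbol for it.
        if u and word.startswith(u, pos):
            stack.append("U")
            pos += len(u)
        else:
            return False


def _finish(word: str, pos: int, height: int, x: str, u: str, y: str) -> bool:
    """Read yx, pop one symbol per copy of u, then read y up to the end."""
    middle = y + x
    if not word.startswith(middle, pos):
        return False
    pos += len(middle)
    for _ in range(height):
        if not word.startswith(u, pos):
            return False
        pos += len(u)
    return word[pos:] == y
```

For example, with `x = "a"`, `u = "b"`, `y = "c"`, the word `abbcabbc` is accepted while `abbcabc` is rejected, because the second half contains fewer copies of `u` than were pushed.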
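The rewriting step used in the exception can be checked by brute force on small exponents. The sketch below (helper names are mine) builds the squared word of the path with `n` and `m` loop iterations, builds the rearranged form that the PDA reads, and compares them. It is a sanity check on examples, not a proof.

```python
def square_of_path(x: str, u: str, y: str, v: str, n: int, m: int) -> str:
    """The word ww where w = x u^n y v^m."""
    w = x + u * n + y + v * m
    return w + w


def rearranged(x: str, u: str, y: str, v: str, n: int, m: int) -> str:
    """The form u^n xy u^n v^m xy v^m read by the push/pop PDA."""
    return u * n + x + y + u * n + v * m + x + y + v * m


def identity_holds(x: str, u: str, y: str, v: str, bound: int = 6) -> bool:
    """Compare both forms for all 0 <= n, m < bound."""
    return all(
        square_of_path(x, u, y, v, n, m) == rearranged(x, u, y, v, n, m)
        for n in range(bound)
        for m in range(bound)
    )
```

With `x = "a"`, `u = "a"`, `y = "b"`, `v = "aa"` (the example `aa*b(aa)*`), `identity_holds` returns `True`. With `x = ""`, `u = "a"`, `y = ""`, `v = "b"` (the language `a*b*`, where the loop labels do not commute), it returns `False`, since already `abab` differs from `aabb`.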
The final PDA is the union of the PDAs for each pattern.
Necessary
(Hand-waving.) If there is a path with two loops, even in the simplest case where you must take one and then the other (for instance $a^*b^*$), you must remember how many times each one is taken, but the stack discipline prevents you from replaying them in the same order. Notice that minimality of the DFA matters in the characterization: it avoids using two loops where one would suffice.
For now the necessary part is only a conjecture, and more exceptions may be needed to get the exact condition. I would be interested in counterexamples.