Non-strict and lazy, while informally interchangeable, apply to different domains of discussion.
Non-strict refers to semantics: the mathematical meaning of an expression. The world to which non-strict applies has no concept of the running time of a function, memory consumption, or even a computer. It simply talks about which values in the domain map to which values in the codomain. In particular, a strict function must map the value ⊥ ("bottom"; see the semantics link above for more about this) to ⊥, whereas a non-strict function is allowed not to.
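As a rough illustration of the ⊥-mapping rule, here is a minimal sketch in Haskell. The names constFive and increment are made up for this example, and undefined is used only as a convenient stand-in for ⊥ (a non-terminating computation is also ⊥, but is harder to observe).

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical example: ignores its argument, so it maps ⊥ to 5.
-- It does not send ⊥ to ⊥, hence it is non-strict.
constFive :: Int -> Int
constFive _ = 5

-- Hypothetical example: needs its argument to produce a result,
-- so it maps ⊥ to ⊥. It is strict.
increment :: Int -> Int
increment x = x + 1

main :: IO ()
main = do
  -- A defined result even though the argument is ⊥.
  print (constFive undefined)
  -- Forcing increment undefined hits ⊥, which shows up here as an exception.
  r <- try (evaluate (increment undefined)) :: IO (Either SomeException Int)
  putStrLn (either (const "increment undefined is bottom") show r)
```

Note that nothing in the definitions of constFive or increment says how or when arguments are evaluated; the strict/non-strict distinction is only about which inputs map to which outputs.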
Lazy refers to operational behavior: the way code is executed on a real computer. Most programmers think of programs operationally, so this is probably what you have in mind. Lazy evaluation refers to implementation using thunks: pointers to code which are replaced with a value the first time they are executed. Notice the non-semantic words here: "pointer", "first time", "executed".
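The "replaced with a value the first time" part can be observed, at least informally, with Debug.Trace. This is only a sketch under the assumption that you compile with GHC; the name shared is invented for the example, and trace is a debugging aid whose message goes to stderr.

```haskell
import Debug.Trace (trace)

-- Hypothetical example: a top-level binding that starts life as a thunk.
-- The trace message fires when the thunk is actually evaluated.
shared :: Int
shared = trace "evaluating shared" (2 + 3)

main :: IO ()
main = do
  putStrLn "shared is defined, but nothing has forced it yet"
  -- The first use forces the thunk; the second use within the same
  -- expression reuses the stored value.
  print (shared + shared)
  -- The thunk has already been overwritten with its value, so no
  -- further evaluation is needed here.
  print shared
```

With GHC, the trace message typically appears once, even though shared is used three times. That single evaluation is an operational fact about the implementation; the semantics of the program would be the same if the expression were recomputed at every use.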
Lazy evaluation gives rise to non-strict semantics, which is why the concepts seem so close together. But as FUZxxl points out, laziness is not the only way to implement non-strict semantics.
If you are interested in learning more about this distinction, I highly recommend the link above. Reading it was a turning point in my conception of the meaning of computer programs.