220 2804 <913f9603-aff2-4abf-a330-d4d0f8a2d1e0@isocpp.org> article
Path: news.gmane.org!not-for-mail
From: Nicol Bolas <jmckesson@gmail.com>
Newsgroups: gmane.comp.lang.c++.isocpp.proposals
Subject: Improvements for string literals
Date: Sun, 10 Feb 2013 18:52:27 -0800 (PST)
Lines: 260
Approved: news@gmane.org
Message-ID: <913f9603-aff2-4abf-a330-d4d0f8a2d1e0@isocpp.org>
Reply-To: std-proposals@isocpp.org
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/alternative; 
	boundary="----=_Part_139_6368071.1360551147809"
X-Trace: ger.gmane.org 1360551149 22680 80.91.229.3 (11 Feb 2013 02:52:29 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 11 Feb 2013 02:52:29 +0000 (UTC)
To: std-proposals@isocpp.org
Original-X-From: std-proposals+bncBCEKFTV6ZUMBB3NZ4GEAKGQED6URRMQ@isocpp.org Mon Feb 11 03:52:50 2013
Return-path: <std-proposals+bncBCEKFTV6ZUMBB3NZ4GEAKGQED6URRMQ@isocpp.org>
Envelope-to: gclcip-std-proposals@m.gmane.org
Original-Received: from mail-vb0-f71.google.com ([209.85.212.71])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <std-proposals+bncBCEKFTV6ZUMBB3NZ4GEAKGQED6URRMQ@isocpp.org>)
	id 1U4jVh-0006nF-H0
	for gclcip-std-proposals@m.gmane.org; Mon, 11 Feb 2013 03:52:49 +0100
Original-Received: by mail-vb0-f71.google.com with SMTP id p1sf7335036vbi.2
        for <gclcip-std-proposals@m.gmane.org>; Sun, 10 Feb 2013 18:52:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=x-received:x-beenthere:x-received:date:from:to:message-id:subject
         :mime-version:x-original-sender:reply-to:precedence:mailing-list
         :list-id:x-google-group-id:list-post:list-help:list-archive
         :list-subscribe:list-unsubscribe:content-type;
        bh=TLooXKqVcfpnaC2rwSzPWcUrncddQo0laoJalepwkXQ=;
        b=F0x4Q8OsZkB5zw+L2GZKqd5w4ctc0JrlPd9PfqjnpV0Jl82R2S7KAbgoZFZiwuM6D+
         fCpUMSsNwIZHYhzfs+ZoIgOFgsu4LieIOcGhmMT/ictRK3ugFGKrUOKzYvQ5qanhsJF6
         Sj1mb/5a8PNYvAUbuVfICzkP2LupbyhzbNSP0E4YrIm8XTh7jH2xeQF8ZfKI1MoSNlau
         12ZSbRvbtn1LDOws8pnRXWz+uJzU0SrbHiTOSY1DdtKUNTkgWsXQfx0e0v4hPApOKGkJ
         /syoIH8VRwO61TnhMU9gOAewwgGqPhvfGk/2cap5rY+eAuIE4zEhJCdBGpkUpdhTup4V
         qOiA==
X-Received: by 10.224.176.196 with SMTP id bf4mr8391307qab.4.1360551149753;
        Sun, 10 Feb 2013 18:52:29 -0800 (PST)
X-BeenThere: std-proposals@isocpp.org
Original-Received: by 10.49.87.72 with SMTP id v8ls1575937qez.78.gmail; Sun, 10 Feb
 2013 18:52:28 -0800 (PST)
X-Received: by 10.49.127.198 with SMTP id ni6mr839304qeb.23.1360551148223;
        Sun, 10 Feb 2013 18:52:28 -0800 (PST)
X-Original-Sender: jmckesson@gmail.com
Precedence: list
Mailing-list: list std-proposals@isocpp.org; contact std-proposals+owners@isocpp.org
List-ID: <std-proposals.isocpp.org>
X-Google-Group-Id: 399137483710
List-Post: <http://groups.google.com/a/isocpp.org/group/std-proposals/post?hl=en>,
 <mailto:std-proposals@isocpp.org>
List-Help: <http://support.google.com/a/isocpp.org/bin/topic.py?hl=en&topic=25838>,
 <mailto:std-proposals+help@isocpp.org>
List-Archive: <http://groups.google.com/a/isocpp.org/group/std-proposals/?hl=en>
List-Subscribe: <http://groups.google.com/a/isocpp.org/group/std-proposals/subscribe?hl=en>,
 <mailto:std-proposals+subscribe@isocpp.org>
List-Unsubscribe: <http://groups.google.com/a/isocpp.org/group/std-proposals/subscribe?hl=en>,
 <mailto:googlegroups-manage+399137483710+unsubscribe@googlegroups.com>
Xref: news.gmane.org gmane.comp.lang.c++.isocpp.proposals:2804
Archived-At: <http://permalink.gmane.org/gmane.comp.lang.c++.isocpp.proposals/2804>

------=_Part_139_6368071.1360551147809
Content-Type: text/plain; charset=ISO-8859-1

Note: the type `charT` refers to any of the available character types, for 
the various string literals.

It is often useful to differentiate between an arbitrary string and a
string that comes from a string literal. So what I was thinking was that
string literals should have some specialized type, which can be used
instead of a `const charT[]`/`const charT*`.

The first hurdle is the definition of this. I don't think it can be a real
C++ type, because C++ overload resolution rules would play havoc with this.
Even if the type were implicitly convertible to a `const charT[]` and `const
charT*`, an implicit conversion sequence between the source type and the
destination type may contain only one user-defined conversion. So something
as simple as this:

void Func(const std::string &str);

Func("a string");

would not work without modifications to `std::string`. And while we could
do that with our string classes, we can't do it to other people's classes.
So this would be a *massively* breaking change.
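
To make the failure concrete, here is a minimal sketch. `naive_literal` is
a hypothetical stand-in for the new literal type (nothing like it exists
today) and has only a conversion to `const char*`. The implicit call would
need two user-defined conversions (to `const char*`, then to
`std::string`), so only the explicit spelling compiles:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the proposed literal type; not a real library type.
struct naive_literal
{
  const char *text;
  operator const char*() const { return text; }
};

std::size_t Func(const std::string &str) { return str.size(); }

// One user-defined conversion is fine.
static_assert(std::is_convertible<naive_literal, const char*>::value,
              "converts implicitly to a pointer");
// naive_literal -> const char* -> std::string would need two of them.
static_assert(!std::is_convertible<naive_literal, std::string>::value,
              "no implicit conversion to std::string");

// Func(naive_literal{"a string"});  // would not compile

// Spelling out the std::string construction leaves one implicit step.
std::size_t call_explicitly()
{
  return Func(std::string(naive_literal{"a string"}));
}
```

Every existing call site that passes a literal to a `std::string`
parameter would need that explicit spelling, which is the breakage
described above.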

Now, we *could* do it with some template metaprogramming magic. Using
`static if` notation (because I have no stomach for `std::enable_if`):

struct string_literal
{
  template<typename T>
  operator T() const if(is_convertible<const char*, T>::value)
  { return T((const char*)*this); }

  operator const char*() const {...}
};

This would allow the type to be convertible to any T which can be
implicitly constructed from a `const char*`.
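
For anyone who wants something that compiles today, the same idea can be
sketched with `std::enable_if` after all. Assumptions: the constructor from
a pointer is only there so the sketch can be exercised (a real literal type
would be produced by the compiler), and `third_party_name` and `Length` are
made-up names standing in for other people's classes:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical library spelling of the proposed type.
class string_literal
{
public:
  explicit constexpr string_literal(const char *text) : text_(text) {}

  operator const char*() const { return text_; }

  // Participates only for T that a const char* implicitly converts to.
  template<typename T,
           typename = typename std::enable_if<
             std::is_convertible<const char*, T>::value>::type>
  operator T() const { return T(text_); }

private:
  const char *text_;
};

// Made-up class that is implicitly constructible from const char* only.
struct third_party_name
{
  third_party_name(const char *s) : length(std::strlen(s)) {}
  std::size_t length;
};

std::size_t Func(const std::string &str) { return str.size(); }
std::size_t Length(const third_party_name &name) { return name.length; }
```

With this, `Func(string_literal("a string"))` goes through a single
user-defined conversion (the templated operator with `T = std::string`),
and `Length` works the same way without `third_party_name` knowing about
`string_literal` at all.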

Another potential pitfall is in template argument deduction contexts.
Normally, a string literal bound to a reference parameter is deduced as a
`const charT [n]`. So if a function is expecting that kind of deduction to
work, it could become a problem. I'm not sure how to resolve that one;
perhaps some specific language around template argument deduction of string
literals would work.
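
As a reminder of what existing code relies on, here is a small sketch of
the two deduction behaviours in current C++. The function names
`literal_length` and `decays_to_pointer` are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Reference parameter: a literal binds as const char (&)[N], so N is deduced.
template<std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) { return N - 1; }

// By-value parameter: the array decays, so T is deduced as const char*.
template<typename T>
constexpr bool decays_to_pointer(T)
{ return std::is_same<T, const char*>::value; }

static_assert(literal_length("a string") == 8, "bound includes the terminator");
static_assert(decays_to_pointer("a string"), "by-value deduction sees a pointer");
```

Note that `literal_length` happily accepts any named `const char` array
too, so this kind of deduction cannot tell a literal from an ordinary
array. Code written like this is what a new literal type would have to
keep working.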

Anyone think of any other problems?

Now obviously, this is a lot of work, fraught with peril. What is the 
upside?

1: We get the ability to have functions which take values that we are
*certain* are string literals.

2: We can use template metaprogramming to differentiate between someone 
passing any old `const charT*` and an actual string literal. This could be 
useful for std::string implementations, where they could store an allocated 
string or a string_literal, depending on how it was initialized. Obviously 
non-const member accesses will cause a copy of the string, similar to COW, 
but without the general dangers of that construct.

3: We can do useful tricks. For example, it would be great to introduce a 
char8_t for UTF-8 encoded literals. But because we didn't do that for 
C++11, we can't have C++14 simply declare that `u8""` must become a `const 
char8_t[]`; it would break any code that currently expects it to be a 
`const char[]`. However, we can declare that the type `u8""` returns can be
implicitly converted into both of them. But after the conversion is done,
you can't convert between them. So only the direct use of the literal
expression results in something special that can become either one.
Namely, this type.
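
The trick in point 3 can be sketched in library code, under some loud
assumptions: `utf8_char` is a made-up stand-in for the `char8_t` wished for
above, `u8_literal` is a hypothetical spelling of what `u8""` would return,
and the constructor from a pointer exists only so the sketch can be
exercised:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Made-up stand-in for a dedicated UTF-8 character type.
enum class utf8_char : unsigned char {};

// Hypothetical type of a u8"" literal: converts to either pointer type.
class u8_literal
{
public:
  explicit u8_literal(const char *bytes) : bytes_(bytes) {}

  operator const char*() const { return bytes_; }
  operator const utf8_char*() const
  { return reinterpret_cast<const utf8_char*>(bytes_); }

private:
  const char *bytes_;
};

// Once converted, the two pointer types no longer interconvert.
static_assert(!std::is_convertible<const utf8_char*, const char*>::value,
              "UTF-8 pointer does not become a plain char pointer");
static_assert(!std::is_convertible<const char*, const utf8_char*>::value,
              "plain char pointer does not become a UTF-8 pointer");

// Existing code that expects a const char* keeps working.
std::size_t byte_length(const char *s) { return std::strlen(s); }

// New code can ask for the UTF-8 type and count code points instead.
std::size_t code_point_count(const utf8_char *s)
{
  // Read the bytes as unsigned char; continuation bytes look like 10xxxxxx.
  const unsigned char *p = reinterpret_cast<const unsigned char*>(s);
  std::size_t count = 0;
  for (; *p != 0; ++p)
    if ((*p & 0xC0) != 0x80) ++count;
  return count;
}
```

Passing the same `u8_literal` object to `byte_length` or to
`code_point_count` picks the matching conversion from the parameter type,
which is the "pick one" behaviour described in point 3.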

As for the class interface itself, I have no problem with forcing you to 
convert it into a `string_ref` (or whatever we're calling it) in order to 
access any functionality more advanced than:

constexpr size_t size() const noexcept;
constexpr bool empty() const noexcept;
constexpr const_reference operator[](size_t pos) const;
constexpr const charT *data() const noexcept;
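
A rough sketch of a class with just that interface, to show how little is
needed. The name `basic_string_literal` and the array-capturing constructor
are assumptions for the sake of the example; under the proposal the
compiler would create these objects itself:

```cpp
#include <cassert>
#include <cstddef>

template<typename charT>
class basic_string_literal
{
public:
  typedef const charT &const_reference;

  // Hypothetical: capture the array so the length comes from its bound.
  template<std::size_t N>
  constexpr basic_string_literal(const charT (&text)[N]) noexcept
    : data_(text), size_(N - 1) {}

  constexpr std::size_t size() const noexcept { return size_; }
  constexpr bool empty() const noexcept { return size_ == 0; }
  constexpr const_reference operator[](std::size_t pos) const
  { return data_[pos]; }
  constexpr const charT *data() const noexcept { return data_; }

private:
  const charT *data_;   // points at the literal's static storage
  std::size_t size_;    // length without the terminating null
};

// Everything is usable at compile time.
static_assert(basic_string_literal<char>("abc").size() == 3,
              "size excludes the terminator");
```

Anything fancier (searching, substrings, comparison) would come from
converting to `string_ref`, as said above.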

So... good? Bad?


------=_Part_139_6368071.1360551147809--

.
